Dynamic scheduling of Web API calls

ABSTRACT

In some embodiments, a method may involve a computing system sending, to a client device, first data indicating that the client device is authorized to send an application programming interface (API) call to the computing system during at least a first time slot. The computing system may receive, from the client device, a first API call during the first time slot, and may process the first API call. In some embodiments, a method may involve a computing system sending, to a client device, first data indicating that the client device is prohibited from sending an API call to the computing system during at least a first time slot. The computing system may receive, from the client device, a first API call during a second time slot that is different than the first time slot, and may process the first API call.

BACKGROUND

Various systems have been developed that allow client devices to access applications and/or data files over a network. Certain products offered by Citrix Systems, Inc., of Fort Lauderdale, Fla., including the Citrix Workspace™ and Citrix ShareFile® families of products, provide such capabilities. Some such systems employ applications or services that can be accessed over the internet via Web application programming interface (Web API) calls from client devices or systems.

SUMMARY

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features, nor is it intended to limit the scope of the claims included herewith.

In some of the disclosed embodiments, a method involves sending, from a computing system to a client device, a first token that includes first data indicating that the client device is authorized to send an application programming interface (API) call to the computing system during at least a first time slot, the first token including a first signature based on the first data and a private key of the computing system. The computing system receives a first API call during the first time slot, and determines that the first API call includes a second token, the second token including second data and a second signature. The computing system uses the private key and the second data to determine that the second signature is valid, and determines that the second data indicates that the client device was authorized to send the first API call to the computing system during the first time slot. The computing system processes the first API call based at least in part on the second data indicating that the client device was authorized to send the first API call to the computing system during the first time slot.

In some disclosed embodiments, a method involves sending, from a computing system to a client device, first data indicating that the client device is authorized to send an application programming interface (API) call to the computing system during at least a first time slot; receiving, by the computing system and from the client device, a first API call during the first time slot; and processing, by the computing system, the first API call.

In some disclosed embodiments, a method involves sending, from a computing system to a client device, first data indicating that the client device is prohibited from sending an API call to the computing system during at least a first time slot; receiving, by the computing system and from the client device, a first API call during a second time slot that is different than the first time slot; and processing, by the computing system, the first API call.

In some disclosed embodiments, a system comprises at least one processor, and at least one computer-readable medium encoded with instructions which, when executed by the at least one processor, cause the system to send, to a client device, first data indicating that (A) the client device is authorized to send an application programming interface (API) call to the system during at least a first time slot, or (B) the client device is prohibited from sending an API call to the system during at least a second time slot that is different than the first time slot, to receive, from the client device, a first API call during the first time slot, and to process, by the system, the first API call.

BRIEF DESCRIPTION OF THE DRAWINGS

Objects, aspects, features, and advantages of embodiments disclosed herein will become more fully apparent from the following detailed description, the appended claims, and the accompanying figures in which like reference numerals identify similar or identical elements. Reference numerals that are introduced in the specification in association with a figure may be repeated in one or more subsequent figures without additional description in the specification in order to provide context for other features, and not every element may be labeled in every figure. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments, principles and concepts. The drawings are not intended to limit the scope of the claims included herewith.

FIG. 1 is a high-level illustration of an intelligent Web API call management system configured in accordance with some aspects of the present disclosure;

FIG. 2 is a diagram of a network environment in which some embodiments of the intelligent Web API call management system disclosed herein may be deployed;

FIG. 3 is a block diagram of a computing system that may be used to implement one or more of the components of the computing environment shown in FIG. 2 in accordance with some embodiments;

FIG. 4 is a schematic block diagram of a cloud computing environment in which various aspects of the disclosure may be implemented;

FIG. 5 is a block diagram illustrating example components that may be used to implement some embodiments of the intelligent Web API call management system shown in FIG. 1;

FIG. 6 illustrates features of an example policy token that may be employed to communicate call policy information to clients in accordance with some embodiments of the present disclosure;

FIG. 7 is a flowchart illustrating an example routine that may be performed by the gateway shown in FIG. 5 in accordance with some embodiments of the present disclosure; and

FIG. 8 shows an example table that may be maintained and used by the gateway shown in FIG. 5 in connection with the routine shown in FIG. 7.

DETAILED DESCRIPTION

For purposes of reading the description of the various embodiments below, the following descriptions of the sections of the specification and their respective contents may be helpful:

Section A provides an introduction to example embodiments of an intelligent Web API call management system configured in accordance with some aspects of the present disclosure;

Section B describes a network environment which may be useful for practicing embodiments described herein;

Section C describes a computing system which may be useful for practicing embodiments described herein;

Section D describes embodiments of systems and methods for accessing computing resources using a cloud computing environment;

Section E provides a more detailed description of example embodiments of the intelligent Web API call management system introduced in Section A; and

Section F describes example implementations of methods, systems/devices, and computer-readable media in accordance with the present disclosure.

A. Introduction to Illustrative Embodiments of an Intelligent Web API Call Management System

Oftentimes services expose capabilities to clients as Web APIs. Clients may need to call such Web APIs periodically (hourly, daily, weekly, etc.) to achieve some business intent, e.g., to get a lease for using a business capability or to renew an existing lease. Other examples of such an API interaction pattern could be periodically reporting the health/presence of a client device (beacon), polling for some information from a server, refreshing an access token, etc. With this kind of uncoordinated or otherwise random pattern of interaction, it is a challenge to service API traffic during a particular time window. Spike(s) or overload of traffic at any one time may result in either outages or over-provisioning on the server side during some time periods and under-utilization during other time periods. This problem can manifest in any circumstance in which a group of N-clients makes Web API calls to a service periodically.

One approach for addressing this issue is to implement rate-limiting of API calls on the server side. Such a measure can help prevent “bad” clients from bombarding the service with requests by limiting the number of calls they can make within a given time period. However, because clients independently determine the times at which individual API calls are sent, this approach cannot prevent a large group of “good” clients from making legitimate calls (i.e., within their rate-limit) at the same time and thus overloading the server.
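By way of illustration only, the following Python sketch shows why per-client rate-limiting alone may not prevent such an overload. The class name `PerClientRateLimiter`, the fixed-window strategy, and the specific limit values are assumptions made for this example and are not taken from the disclosure.

```python
from collections import defaultdict


class PerClientRateLimiter:
    """Hypothetical fixed-window limiter: at most `limit` calls per client per window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        # Maps (client id, window index) to the number of calls accepted so far.
        self.counts = defaultdict(int)

    def allow(self, client_id, now):
        window = int(now // self.window_seconds)
        key = (client_id, window)
        if self.counts[key] >= self.limit:
            return False  # this client has exhausted its own quota
        self.counts[key] += 1
        return True


limiter = PerClientRateLimiter(limit=1, window_seconds=3600)

# 1000 "good" clients each send one legitimate call at the same instant.
# Every call is within that client's own limit, so all of them are accepted
# and the server sees the full spike.
accepted_at_once = sum(limiter.allow(f"client-{i}", now=0.0) for i in range(1000))

# A "bad" client that keeps retrying within the same window is throttled.
retries = [limiter.allow("client-0", now=1.0) for _ in range(3)]
```

In this sketch the limiter bounds what any one client can do, but it has no view of how many distinct clients choose the same moment to call.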

As used herein, the term “good clients” refers to the clients that access the service in good faith by following the service specification, purpose, and guidance. The term “bad clients,” on the other hand, refers to the clients that abuse and put additional stress on the service by not following the service specification, purpose and/or guidance. In some circumstances, bad clients may even deliberately perpetrate a denial of service (DoS) cyber-attack by bombarding a service with API calls in an effort to crash the service and/or prevent good clients from accessing it. It should be appreciated, however, that bad clients are not necessarily unauthorized clients trying to get access to the service. They may also be authenticated and authorized clients making unnecessary requests too often, out-of-order, etc.

On the server side, it may be possible to do optimizations like pre-processing/caching etc., so that each request can be processed faster and thereby increase throughput. A service may also bring additional resources on line to accommodate periods of increased client requests. Such approaches do not, however, prevent multiple clients from issuing legitimate requests to the service at the same time, so the above problem of overloading the server still remains.

Existing server-side solutions tend to be reactive in nature because the servers either drop calls when their capacity (e.g., a threshold) is reached or elastically scale to be able to meet the increased need at a future time but do not allow the processing of current calls. None of them take steps to proactively prevent the problem.

On the client side, a possible approach is for individual clients to randomize the times at which they send requests. Given that individual clients make the decision when to send their respective requests in isolation, however, such an approach may or may not help because it is still possible for a large number of clients to issue Web API calls at the same time. In short, present client side solutions do not solve the above-described problem because the receipt of such calls is still random and uncoordinated, which can cause difficulty in their processing.

Offered is a system in which a web service can exercise at least a minimal degree of control over the timing of Web API calls that it receives from respective clients, thus enabling the service to coordinate or otherwise orchestrate such calls to be spread over time and minimize the likelihood that the service will receive an unduly large number of API calls at the same time. FIG. 1 shows a high-level implementation of an intelligent Web API call management system 100 in which a web service 102 (e.g., implemented by one or more servers 204, examples of which are described below) may exercise control over the timing of Web API calls 104 that are sent by one or more clients 202, examples of which are described below, in such a manner.

As shown, in some implementations, the web service 102 may provide call policy information 106 to respective client devices 202 that specifies one or more policies concerning times at which and/or circumstances under which the client devices 202 are authorized to send the Web API calls 104 to the web service 102. As explained in more detail below, the call policy information 106 sent to respective clients may be dynamically determined by the web service 102 so as to ensure that the client devices 202 collectively distribute their Web API calls 104 over time, thus minimizing the risk that the web service 102 will be inundated by an unduly large number of Web API calls 104 at the same time. In some implementations, for example, the web service 102 may set the call policy information 106 so that respective clients 202 are informed of particular allowed time slots, periods, intervals, phases, openings and the like (alternatively referred to herein as “allowed slots”) during which they are authorized to send Web API calls 104 to the web service 102 and/or of particular blocked time slots or the like (alternatively referred to herein as “blocked slots”) during which they are prohibited from sending Web API calls 104 to the web service 102.
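As a minimal sketch of how a client might represent and consult such call policy information, consider the following Python example. The names `TimeSlot`, `CallPolicy`, and `may_call`, and the rule that blocked slots take precedence over allowed slots, are hypothetical choices made for illustration; the disclosure does not prescribe a particular data layout.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TimeSlot:
    """Half-open interval [start, end) during which a rule applies."""
    start: datetime
    end: datetime

    def contains(self, moment):
        return self.start <= moment < self.end


@dataclass
class CallPolicy:
    """Hypothetical client-side view of call policy information."""
    allowed_slots: list = field(default_factory=list)
    blocked_slots: list = field(default_factory=list)

    def may_call(self, moment):
        # Assumption: a blocked slot always wins over an allowed slot.
        if any(slot.contains(moment) for slot in self.blocked_slots):
            return False
        # Assumption: if allowed slots are listed, calls must fall inside one.
        if self.allowed_slots:
            return any(slot.contains(moment) for slot in self.allowed_slots)
        # Only blocked slots (or nothing) were specified: any other time is fine.
        return True


base = datetime(2024, 1, 1, tzinfo=timezone.utc)
policy = CallPolicy(
    allowed_slots=[TimeSlot(base + timedelta(hours=2), base + timedelta(hours=3))],
    blocked_slots=[TimeSlot(base, base + timedelta(hours=1))],
)

in_allowed = policy.may_call(base + timedelta(hours=2, minutes=30))
in_blocked = policy.may_call(base + timedelta(minutes=30))
outside_both = policy.may_call(base + timedelta(hours=5))
```

A client following this sketch would defer a pending Web API call until `may_call` returns true for the current time.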

By keeping track of such slot assignments as they are made, the web service 102 may maintain a forward looking schedule indicating the approximate numbers of Web API calls that are expected to occur, assuming the clients follow the call policy information 106 sent to them, during respective time slots within some future period of time (e.g., for one hour slots during the following twenty-four hours), and can set the call policy information 106 that is subsequently sent to clients 202 to ensure that the total numbers of expected calls for the different time slots on the schedule are approximately the same, or are otherwise appropriately distributed amongst the future time slots in view of the expected capacity of the system during such time slots. In some implementations, the web service 102 may further track historical data concerning the numbers of Web API calls 104 that are actually processed during respective historical time slots, and may further use that historical data to determine how to best scale the system to meet expected demands during corresponding time slots in the future, and/or to determine how the system is likely to be scaled during particular time slots for purposes of determining how best to set the call policy information 106, as discussed above.
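One possible way to maintain such a forward looking schedule is sketched below in Python. The `SlotSchedule` class and its greedy rule (assign each new client to the slot with the lowest ratio of expected calls to expected capacity) are assumptions introduced for this example; the disclosure leaves the assignment strategy open.

```python
class SlotSchedule:
    """Hypothetical forward looking schedule of expected calls per future slot."""

    def __init__(self, capacities):
        # capacities[i] is the expected capacity of the system during slot i.
        self.capacities = list(capacities)
        # expected[i] is the number of calls already scheduled into slot i.
        self.expected = [0] * len(self.capacities)

    def assign_slot(self):
        """Pick the relatively least-loaded slot and record the assignment."""
        slot = min(
            range(len(self.expected)),
            # Ties are broken in favor of the earliest slot.
            key=lambda i: (self.expected[i] / self.capacities[i], i),
        )
        self.expected[slot] += 1
        return slot


# Four future slots; the third is expected to have twice the capacity.
schedule = SlotSchedule(capacities=[100, 100, 200, 100])
assignments = [schedule.assign_slot() for _ in range(500)]
```

Under these assumptions the expected calls end up distributed in proportion to capacity, so no slot is scheduled beyond what the system is expected to handle.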

In some implementations, the call policy information 106 may be included as a part of the API responses (e.g., as an attribute in the header or body of the response) the web service 102 sends to the clients 202 in response to respective Web API calls 104. As explained in more detail below, for example, in some implementations the call policy information 106 may be included within a token (e.g., a JSON Web Token (JWT)) or other credential, data fragment, etc., that may be appended to such API responses. In other implementations, the web service 102 may additionally or alternatively expose a dedicated endpoint from which the clients 202 may retrieve the call policy information 106. Allowing the client devices 202 to retrieve the call policy information 106 on demand from a dedicated endpoint may be more suitable for relatively static situations, where the call policy information 106 does not change significantly over time. Including the call policy information 106 within respective API responses, on the other hand, may make it easier for the web service 102 to dynamically change the call policy information 106 after receipt of individual Web API calls 104 and adapt the call policy information 106 to current load, bandwidth, resource availability, etc.
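The following Python sketch illustrates, under stated assumptions, how call policy information might be packaged in a signed, JWT-style token and later checked by the service. To stay within the standard library, the example signs with HMAC-SHA256 and a single service-held secret, which is a substitution for whatever key scheme an actual implementation uses. The function names and the `allowed_slots` payload field are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data):
    """Base64url-encode bytes without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    """Decode unpadded base64url text back to bytes."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_policy_token(policy, key):
    """Serialize the policy and sign header.payload with the service-held key."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(policy, sort_keys=True).encode())
    signing_input = f"{header}.{payload}".encode()
    signature = _b64url(hmac.new(key, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{signature}"


def verify_policy_token(token, key):
    """Return the policy if the signature is valid, otherwise None."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    header, payload, signature = parts
    signing_input = f"{header}.{payload}".encode()
    expected = _b64url(hmac.new(key, signing_input, hashlib.sha256).digest())
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return None
    return json.loads(_b64url_decode(payload))


service_key = b"example-secret-held-only-by-the-service"  # assumption for the sketch
token = issue_policy_token({"allowed_slots": [14]}, service_key)
verified = verify_policy_token(token, service_key)

# A client that swaps in a different payload cannot produce a matching signature.
forged_payload = _b64url(json.dumps({"allowed_slots": [0]}).encode())
head, _, sig = token.split(".")
rejected = verify_policy_token(f"{head}.{forged_payload}.{sig}", service_key)
```

Because the signature covers the policy data, a service following this sketch can trust a token returned with a later Web API call without storing per-client policy state.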

FIG. 1 further shows an example routine 108 that may be performed by one or more components of the web service 102 in accordance with some embodiments of the present disclosure. As shown, at a step 110 of the routine 108, a computing system (e.g., the web service 102) may send first data (e.g., the call policy information 106) to a client device 202. The first data (e.g., the call policy information 106) may indicate that (A) the client device 202 is authorized to send an API call (e.g., a Web API call 104) to the computing system (e.g., the web service 102) during at least a first time slot (e.g., an allowed slot) and/or (B) the client device 202 is prohibited from sending an API call to the computing system (e.g., the web service 102) during at least a second time slot (e.g., a blocked slot) that is different than the first time slot.

At a step 112 of the routine 108, the computing system (e.g., the web service 102) may receive, from the client device 202, a first API call (e.g., a Web API call 104) during the first time slot (e.g., an allowed slot and/or a non-blocked slot).

At a step 114 of the routine 108, the computing system (e.g., the web service 102) may process the first API call (e.g., a Web API call 104).
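The three steps of the routine 108 can be sketched in Python as follows. The `WebService` class, the round-robin slot assignment, and the behavior for calls arriving outside the assigned slot are all assumptions made for this illustration; the routine as described above only covers a call that arrives during the first time slot.

```python
class WebService:
    """Hypothetical service that follows steps 110, 112, and 114."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.assigned = {}  # client id -> allowed slot index
        self.next_slot = 0

    def send_policy(self, client_id):
        """Step 110: send first data naming a slot the client may use."""
        slot = self.next_slot
        # Assumption: slots are handed out round-robin to spread clients evenly.
        self.next_slot = (self.next_slot + 1) % self.num_slots
        self.assigned[client_id] = slot
        return {"allowed_slot": slot}

    def handle_call(self, client_id, current_slot):
        """Steps 112 and 114: receive an API call and process it."""
        if self.assigned.get(client_id) != current_slot:
            # Assumption: out-of-slot calls are rejected in this sketch.
            return "rejected"
        return "processed"


service = WebService(num_slots=24)
policy_a = service.send_policy("client-a")
policy_b = service.send_policy("client-b")

on_time = service.handle_call("client-a", current_slot=policy_a["allowed_slot"])
off_slot = service.handle_call("client-b", current_slot=policy_a["allowed_slot"])
```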

Additional details and example implementations of embodiments of the present disclosure are set forth below in Section F, following a description of example systems and network environments in which such embodiments may be deployed.

B. Network Environment

Referring to FIG. 2, an illustrative network environment 200 is depicted. As shown, the network environment 200 may include one or more clients 202(1)-202(n) (also generally referred to as local machine(s) 202 or client(s) 202) in communication with one or more servers 204(1)-204(n) (also generally referred to as remote machine(s) 204 or server(s) 204) via one or more networks 206(1)-206(n) (generally referred to as network(s) 206). In some embodiments, a client 202 may communicate with a server 204 via one or more appliances 208(1)-208(n) (generally referred to as appliance(s) 208 or gateway(s) 208). In some embodiments, a client 202 may have the capacity to function as both a client node seeking access to resources provided by a server 204 and as a server 204 providing access to hosted resources for other clients 202.

Although the embodiment shown in FIG. 2 shows one or more networks 206 between the clients 202 and the servers 204, in other embodiments, the clients 202 and the servers 204 may be on the same network 206. When multiple networks 206 are employed, the various networks 206 may be the same type of network or different types of networks. For example, in some embodiments, the networks 206(1) and 206(n) may be private networks such as local area networks (LANs) or company Intranets, while the network 206(2) may be a public network, such as a metropolitan area network (MAN), wide area network (WAN), or the Internet. In other embodiments, one or both of the network 206(1) and the network 206(n), as well as the network 206(2), may be public networks. In yet other embodiments, all three of the network 206(1), the network 206(2) and the network 206(n) may be private networks. The networks 206 may employ one or more types of physical networks and/or network topologies, such as wired and/or wireless networks, and may employ one or more communication transport protocols, such as transmission control protocol (TCP), internet protocol (IP), user datagram protocol (UDP) or other similar protocols. In some embodiments, the network(s) 206 may include one or more mobile telephone networks that use various protocols to communicate among mobile devices. In some embodiments, the network(s) 206 may include one or more wireless local-area networks (WLANs). For short range communications within a WLAN, clients 202 may communicate using 802.11, Bluetooth, and/or Near Field Communication (NFC).

As shown in FIG. 2, one or more appliances 208 may be located at various points or in various communication paths of the network environment 200. For example, the appliance 208(1) may be deployed between the network 206(1) and the network 206(2), and the appliance 208(n) may be deployed between the network 206(2) and the network 206(n). In some embodiments, the appliances 208 may communicate with one another and work in conjunction to, for example, accelerate network traffic between the clients 202 and the servers 204. In some embodiments, appliances 208 may act as a gateway between two or more networks. In other embodiments, one or more of the appliances 208 may instead be implemented in conjunction with or as part of a single one of the clients 202 or servers 204 to allow such device to connect directly to one of the networks 206. In some embodiments, one or more appliances 208 may operate as an application delivery controller (ADC) to provide one or more of the clients 202 with access to business applications and other data deployed in a datacenter, the cloud, or delivered as Software as a Service (SaaS) across a range of client devices, and/or provide other functionality such as load balancing, etc. In some embodiments, one or more of the appliances 208 may be implemented as network devices sold by Citrix Systems, Inc., of Fort Lauderdale, Fla., such as Citrix Gateway™ or Citrix ADC™.

A server 204 may be any server type such as, for example: a file server; an application server; a web server; a proxy server; an appliance; a network appliance; a gateway; an application gateway; a gateway server; a virtualization server; a deployment server; a Secure Sockets Layer Virtual Private Network (SSL VPN) server; a firewall; a server executing an active directory; a cloud server; or a server executing an application acceleration program that provides firewall functionality, application functionality, or load balancing functionality.

A server 204 may execute, operate or otherwise provide an application that may be any one of the following: software; a program; executable instructions; a virtual machine; a hypervisor; a web browser; a web-based client; a client-server application; a thin-client computing client; an ActiveX control; a Java applet; software related to voice over internet protocol (VoIP) communications like a soft IP telephone; an application for streaming video and/or audio; an application for facilitating real-time-data communications; an HTTP client; an FTP client; an Oscar client; a Telnet client; or any other set of executable instructions.

In some embodiments, a server 204 may execute a remote presentation services program or other program that uses a thin-client or a remote-display protocol to capture display output generated by an application executing on a server 204 and transmit the application display output to a client device 202.

In yet other embodiments, a server 204 may execute a virtual machine providing, to a user of a client 202, access to a computing environment. The client 202 may be a virtual machine. The virtual machine may be managed by, for example, a hypervisor, a virtual machine manager (VMM), or any other hardware virtualization technique within the server 204.

As shown in FIG. 2, in some embodiments, groups of the servers 204 may operate as one or more server farms 210. The servers 204 of such server farms 210 may be logically grouped, and may either be geographically co-located (e.g., on premises) or geographically dispersed (e.g., cloud based) from the clients 202 and/or other servers 204. In some embodiments, two or more server farms 210 may communicate with one another, e.g., via respective appliances 208 connected to the network 206(2), to allow multiple server-based processes to interact with one another.

As also shown in FIG. 2, in some embodiments, one or more of the appliances 208 may include, be replaced by, or be in communication with, one or more additional appliances, such as WAN optimization appliances 212(1)-212(n), referred to generally as WAN optimization appliance(s) 212. For example, WAN optimization appliances 212 may accelerate, cache, compress or otherwise optimize or improve performance, operation, flow control, or quality of service of network traffic, such as traffic to and/or from a WAN connection, such as optimizing Wide Area File Services (WAFS), accelerating Server Message Block (SMB) or Common Internet File System (CIFS). In some embodiments, one or more of the appliances 212 may be a performance enhancing proxy or a WAN optimization controller.

In some embodiments, one or more of the appliances 208, 212 may be implemented as products sold by Citrix Systems, Inc., of Fort Lauderdale, Fla., such as Citrix SD-WAN™ or Citrix Cloud™. For example, in some implementations, one or more of the appliances 208, 212 may be cloud connectors that enable communications to be exchanged between resources within a cloud computing environment and resources outside such an environment, e.g., resources hosted within a data center of an organization.

C. Computing Environment

FIG. 3 illustrates an example of a computing system 300 that may be used to implement one or more of the respective components (e.g., the clients 202, the servers 204, the appliances 208, 212) within the network environment 200 shown in FIG. 2. As shown in FIG. 3, the computing system 300 may include one or more processors 302, volatile memory 304 (e.g., RAM), non-volatile memory 306 (e.g., one or more hard disk drives (HDDs) or other magnetic or optical storage media, one or more solid state drives (SSDs) such as a flash drive or other solid state storage media, one or more hybrid magnetic and solid state drives, and/or one or more virtual storage volumes, such as a cloud storage, or a combination of such physical storage volumes and virtual storage volumes or arrays thereof), a user interface (UI) 308, one or more communications interfaces 310, and a communication bus 312. The user interface 308 may include a graphical user interface (GUI) 314 (e.g., a touchscreen, a display, etc.) and one or more input/output (I/O) devices 316 (e.g., a mouse, a keyboard, etc.). The non-volatile memory 306 may store an operating system 318, one or more applications 320, and data 322 such that, for example, computer instructions of the operating system 318 and/or applications 320 are executed by the processor(s) 302 out of the volatile memory 304. Data may be entered using an input device of the GUI 314 or received from I/O device(s) 316. Various elements of the computing system 300 may communicate via the communication bus 312. The computing system 300 as shown in FIG. 3 is shown merely as an example, as the clients 202, servers 204 and/or appliances 208 and 212 may be implemented by any computing or processing environment and with any type of machine or set of machines that may have suitable hardware and/or software capable of operating as described herein.

The processor(s) 302 may be implemented by one or more programmable processors executing one or more computer programs to perform the functions of the system. As used herein, the term “processor” describes an electronic circuit that performs a function, an operation, or a sequence of operations. The function, operation, or sequence of operations may be hard coded into the electronic circuit or soft coded by way of instructions held in a memory device. A “processor” may perform the function, operation, or sequence of operations using digital values or using analog signals. In some embodiments, the “processor” can be embodied in one or more application specific integrated circuits (ASICs), microprocessors, digital signal processors, microcontrollers, field programmable gate arrays (FPGAs), programmable logic arrays (PLAs), multi-core processors, or general-purpose computers with associated memory. The “processor” may be analog, digital or mixed-signal. In some embodiments, the “processor” may be one or more physical processors or one or more “virtual” (e.g., remotely located or “cloud”) processors.

The communications interfaces 310 may include one or more interfaces to enable the computing system 300 to access a computer network such as a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or the Internet through a variety of wired and/or wireless connections, including cellular connections.

As noted above, in some embodiments, one or more computing systems 300 may execute an application on behalf of a user of a client computing device (e.g., a client 202 shown in FIG. 2), may execute a virtual machine, which provides an execution session within which applications execute on behalf of a user or a client computing device (e.g., a client 202 shown in FIG. 2), such as a hosted desktop session, may execute a terminal services session to provide a hosted desktop environment, or may provide access to a computing environment including one or more of: one or more applications, one or more desktop applications, and one or more desktop sessions in which one or more applications may execute.

D. Systems and Methods for Delivering Shared Resources Using a Cloud Computing Environment

Referring to FIG. 4, a cloud computing environment 400 is depicted, which may also be referred to as a cloud environment, cloud computing or cloud network. The cloud computing environment 400 can provide the delivery of shared computing services and/or resources to multiple users or tenants. For example, the shared resources and services can include, but are not limited to, networks, network bandwidth, servers, processing, memory, storage, applications, virtual machines, databases, software, hardware, analytics, and intelligence.

In the cloud computing environment 400, one or more clients 202 (such as those described in connection with FIG. 2) are in communication with a cloud network 404. The cloud network 404 may include back-end platforms, e.g., servers, storage, server farms and/or data centers. The clients 202 may correspond to a single organization/tenant or multiple organizations/tenants. More particularly, in one example implementation, the cloud computing environment 400 may provide a private cloud serving a single organization (e.g., enterprise cloud). In another example, the cloud computing environment 400 may provide a community or public cloud serving multiple organizations/tenants.

In some embodiments, a gateway appliance(s) or service may be utilized to provide access to cloud computing resources and virtual sessions. By way of example, Citrix Gateway, provided by Citrix Systems, Inc., may be deployed on-premises or on public clouds to provide users with secure access and single sign-on to virtual, SaaS and web applications. Furthermore, to protect users from web threats, a gateway such as Citrix Secure Web Gateway may be used. Citrix Secure Web Gateway uses a cloud-based service and a local cache to check for URL reputation and category.

In still further embodiments, the cloud computing environment 400 may provide a hybrid cloud that is a combination of a public cloud and one or more resources located outside such a cloud, such as resources hosted within one or more data centers of an organization. Public clouds may include public servers that are maintained by third parties to the clients 202 or the enterprise/tenant. The servers may be located off-site in remote geographical locations or otherwise. In some implementations, one or more cloud connectors may be used to facilitate the exchange of communications between one or more resources within the cloud computing environment 400 and one or more resources outside of such an environment.

The cloud computing environment 400 can provide resource pooling to serve multiple users via clients 202 through a multi-tenant environment or multi-tenant model with different physical and virtual resources dynamically assigned and reassigned responsive to different demands within the respective environment. The multi-tenant environment can include a system or architecture that can provide a single instance of software, an application or a software application to serve multiple users. In some embodiments, the cloud computing environment 400 can provide on-demand self-service to unilaterally provision computing capabilities (e.g., server time, network storage) across a network for multiple clients 202. By way of example, provisioning services may be provided through a system such as Citrix Provisioning Services (Citrix PVS). Citrix PVS is a software-streaming technology that delivers patches, updates, and other configuration information to multiple virtual desktop endpoints through a shared desktop image. The cloud computing environment 400 can provide an elasticity to dynamically scale out or scale in, in response to different demands from one or more clients 202. In some embodiments, the cloud computing environment 400 may include or provide monitoring services to monitor, control and/or generate reports corresponding to the provided shared services and resources.

In some embodiments, the cloud computing environment 400 may provide cloud-based delivery of different types of cloud computing services, such as Software as a Service (SaaS) 402, Platform as a Service (PaaS) 404, Infrastructure as a Service (IaaS) 406, and Desktop as a Service (DaaS) 408, for example. IaaS may refer to a user renting the use of infrastructure resources that are needed during a specified time period. IaaS providers may offer storage, networking, servers or virtualization resources from large pools, allowing the users to quickly scale up by accessing more resources as needed. Examples of IaaS platforms include AMAZON WEB SERVICES provided by Amazon.com, Inc., of Seattle, Wash., Azure IaaS provided by Microsoft Corporation of Redmond, Wash., RACKSPACE CLOUD provided by Rackspace US, Inc., of San Antonio, Tex., Google Compute Engine provided by Google Inc. of Mountain View, Calif., and RIGHTSCALE provided by RightScale, Inc., of Santa Barbara, Calif.

PaaS providers may offer functionality provided by IaaS, including, e.g., storage, networking, servers or virtualization, as well as additional resources such as, e.g., the operating system, middleware, or runtime resources. Examples of PaaS include WINDOWS AZURE provided by Microsoft Corporation of Redmond, Wash., Google App Engine provided by Google Inc., and HEROKU provided by Heroku, Inc. of San Francisco, Calif.

SaaS providers may offer the resources that PaaS provides, including storage, networking, servers, virtualization, operating system, middleware, or runtime resources. In some embodiments, SaaS providers may offer additional resources including, e.g., data and application resources. Examples of SaaS include GOOGLE APPS provided by Google Inc., SALESFORCE provided by Salesforce.com Inc. of San Francisco, Calif., or OFFICE 365 provided by Microsoft Corporation. Examples of SaaS may also include data storage providers, e.g., Citrix ShareFile® from Citrix Systems, DROPBOX provided by Dropbox, Inc. of San Francisco, Calif., Microsoft SKYDRIVE provided by Microsoft Corporation, Google Drive provided by Google Inc., or Apple ICLOUD provided by Apple Inc. of Cupertino, Calif.

Similar to SaaS, DaaS (which is also known as hosted desktop services) is a form of virtual desktop infrastructure (VDI) in which virtual desktop sessions are typically delivered as a cloud service along with the apps used on the virtual desktop. Citrix Cloud from Citrix Systems is one example of a DaaS delivery platform. DaaS delivery platforms may be hosted on a public cloud computing infrastructure, such as AZURE CLOUD from Microsoft Corporation of Redmond, Wash., or AMAZON WEB SERVICES provided by Amazon.com, Inc., of Seattle, Wash., for example. In the case of Citrix Cloud, Citrix Workspace app may be used as a single-entry point for bringing apps, files and desktops together (whether on-premises or in the cloud) to deliver a unified experience.

E. Detailed Description of Example Embodiments of an Intelligent Web API Call Management System

FIG. 5 shows examples of components that may be included within some implementations of the intelligent Web API call management system 100 introduced above in connection with FIG. 1. In the illustrated example, the web service 102 includes a gateway 502 (e.g., an API gateway) and one or more microservices 504 a, 504 b. As shown, the gateway 502 may be located between the clients 202 and the microservices 504 and thus provide a single-point entry for the clients 202 to the microservices 504. In particular, the gateway 502 may act as a reverse proxy, routing the Web API calls 104 from the clients 202 to the microservices 504. As also shown, the gateway 502 may additionally aggregate and/or route API responses 506 from the microservices 504 to the respective clients 202.

Further, as illustrated, in some implementations, the gateway 502 may be configured to send policy information (e.g., the call policy information 106 introduced above in Section A) together with the API responses 506 that it sends to the respective clients 202, and the clients 202 may be configured to return that same call policy information 106 together with the next Web API calls 104 that they send to the gateway 502 for distribution to the microservices 504. In some implementations, for example, the call policy information 106 may be encoded within tokens (e.g., JWTs), certificates, data fragments, etc., that the gateway 502 appends to or otherwise includes with the API responses 506 it sends to respective clients 202, and the recipient clients 202 may extract those tokens, certificates, data fragments, etc., and append them to or otherwise include them with the next Web API calls 104 they send to the gateway 502. Including the call policy information 106 with the API responses 506 may be beneficial, as it may allow the gateway 502 to dynamically change the call policy information 106 after receipt of individual Web API calls 104 and adapt the call policy information 106 to current load, bandwidth, resource availability, etc.

As noted above, however, it should be appreciated that the call policy information 106 may take on any of a number of other forms and/or may be provided to clients 202 in any of a number of other ways without departing from the scope of the present disclosure.

FIG. 6 shows an example policy token 602 (e.g., a JWT) that may be used to convey policy information 106 from the gateway 502 to the clients 202, and vice versa, as well as examples of types of information that may be included within the call policy information 106 in some embodiments. As illustrated, the policy token 602 may include a header 604, a payload 606, and a signature 608. In some implementations, the call policy information 106 may be included in the header 604 and/or the payload 606 (e.g., as one or more claims). Further, in some implementations the header 604 may specify a signing technique that the gateway 502 uses to generate the signature 608 based on the content of the header 604 and the payload 606, as well as a private key. In some implementations, for example, the specified signing technique may involve (A) combining the base64url encoded header and the base64url encoded payload, (B) hashing the combined base64url value with a hashing technique, e.g., SHA256, and (C) encrypting the determined hash using a private key. As such, by validating the signature 608 using the private key and the specified signing technique, the gateway 502 may be able to confirm that the call policy information 106 included in the policy tokens 602 it receives from clients 202 has not been altered or otherwise compromised by the clients 202. In some implementations, the header 604 or payload 606 of the policy token 602 may additionally include an identifier (e.g., a client identifier (ID)) identifying the client 202 to which it is sent, thus enabling the gateway 502 to confirm that a received policy token 602 came from the same client 202 that such policy token 602 was originally sent to, thus restricting the transfer of the policy tokens 602 amongst clients 202.
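By way of illustration only, the signing and validation of a policy token 602 might be sketched in Python as follows. The sketch assumes a keyed hash (HMAC-SHA256, as in the JWT "HS256" technique) as a stand-in for the "hash, then encrypt with a private key" step described above, so that the same secret key both generates and validates the signature 608. The function names and the claim names `client_id` and `allowed_slots` are hypothetical and are not taken from the disclosure.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode bytes without padding, as is conventional for JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_policy_token(payload: dict, key: bytes) -> str:
    """Build header.payload.signature from a payload of policy claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    # (A) combine the base64url-encoded header and payload
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":"), sort_keys=True).encode("utf-8"))
        for part in (header, payload)
    )
    # (B)+(C) keyed SHA-256 over the combined value (assumed HMAC variant)
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def verify_policy_token(token: str, key: bytes, client_id: str):
    """Return the payload if the signature and client ID check out, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        return None  # not a three-part token
    signing_input = header_b64 + "." + payload_b64
    expected = b64url(
        hmac.new(key, signing_input.encode("utf-8"), hashlib.sha256).digest()
    )
    # Constant-time comparison of the recomputed and received signatures
    if not hmac.compare_digest(expected.encode("ascii"), signature_b64.encode("utf-8")):
        return None
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    # Hypothetical claim binding the token to the client it was issued to
    if payload.get("client_id") != client_id:
        return None
    return payload
```

In such a sketch, a token presented by a different client, or a token whose payload was modified after signing, would fail validation and could be treated as non-authentic.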

The call policy information 106 that is encoded within the policy tokens 602 included with API responses 506, or that is communicated to the clients 202 in some other fashion, may take on any of numerous forms and may specify any of a number of policies to guide the clients 202 to collectively distribute the Web API calls 104 they send to the gateway 502 over time. As shown in FIG. 6, for example, in some implementations, the call policy information 106 may include one or more “next call” policies 610. As the name implies, such next call policies 610 may indicate the circumstances under which a client 202 may send its next Web API call 104 to the web service 102. The next call policies 610 may, for example, allow the recipient client 202 to make its next call (A) during one or more particular time intervals, (B) in response to the occurrence of particular events (e.g., logon, log off, etc.), and/or (C) during specified time intervals following the occurrence of particular events, e.g., during a two hour time period after the client 202 logs on. More specifically, as illustrated, in some implementations, the next call policies 610 may specify one or more “allowed slots” 610 a representing particular time intervals (e.g., one hour time slots during the following twenty-four hour window) during which the client 202 is permitted to send its next Web API call 104 to the web service 102 and/or one or more “blocked slots” 610 b representing time intervals (e.g., one hour time slots during the following twenty-four hour window) during which the client 202 is prohibited from sending its next Web API call 104 to the web service 102.

In some implementations, the allowed slots 610 a specified in the policy token 602 may additionally indicate (e.g., by being different types of allowed slots 610 a, such as “reserved slots” versus “preferred slots,” or by way of indicators or flags associated with the allowed slots 610 a) a degree of confidence that the web service 102 will have sufficient capacity to handle a Web API call 104 sent during the indicated time slot. In some implementations, for example, the individual allowed slots 610 a may represent or include either (1) an indication that the web service 102 is highly likely to have sufficient capacity to handle a Web API call 104 sent during the indicated time slot, such as by being designated as a “reserved slot,” or (2) an indication that the web service 102 is uncertain as to whether it will have capacity during the indicated time slot and will simply use its “best efforts” to process a Web API call 104 sent during the indicated time slot, such as by being designated as a “preferred slot.”

Further, as also illustrated, the next call policies 610 may additionally or alternatively include one or more “event triggers” 610 c identifying events (e.g., logging on, logging off, etc.) after or during which the recipient client 202 may send its next Web API call 104 to the web service 102, and/or indicating windows of time keyed off or otherwise triggered from such events during which the recipient client 202 may send its next Web API call 104 to the web service 102.

With respect to the timing of the Web API calls 104 that are sent by clients 202, the inventors have recognized and appreciated that even when clients 202 need to issue Web API calls 104 with the same frequency (e.g., at least once every twenty-four hours), they are not generally required to maintain the same time interval (e.g., twenty-four hours) between successive Web API calls 104. Clients 202 usually have a grace period window and are permitted to space their Web API calls 104 anywhere within that grace period window without affecting business. For example, if a client 202 is required to send Web API calls 104 to the web service 102 at a frequency of one call every twenty-four hours, it may be acceptable for the client 202 to send first and second calls spaced by twenty-one hours followed by a third call after another twenty-seven hours. Since the web service 102 may have full knowledge of the traffic pattern from the clients 202, the gateway 502 may make intelligent spacing decisions and guide clients 202 to send Web API calls 104 in such a way that overloading is prevented and the server capacity is properly utilized. In some implementations, the clients 202 may additionally be configured to randomize the Web API calls they send within allowed slots 610 a so as to achieve an even better distribution of load and help avoid large groups of clients 202 making Web API calls 104 at the beginnings of respective time slots.
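The client-side randomization within an allowed slot 610 a might, for example, be sketched in Python as follows. This is only a minimal illustration; the function name `pick_call_time` and the use of a uniform distribution are assumptions rather than details of the disclosure.

```python
import random
from datetime import datetime, timedelta


def pick_call_time(slot_start: datetime, slot_end: datetime, rng=None) -> datetime:
    """Pick a uniformly random instant within an allowed slot.

    Spreading calls across the slot avoids many clients calling at the
    very beginning of the slot. `rng` may be supplied for repeatability.
    """
    rng = rng or random.Random()
    span_seconds = (slot_end - slot_start).total_seconds()
    return slot_start + timedelta(seconds=rng.random() * span_seconds)


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    end = start + timedelta(hours=1)
    print(pick_call_time(start, end))
```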

As also shown in FIG. 6, in some implementations, the call policy information 106 may additionally or alternatively include one or more “retry” policies 612. Once again, as the name implies, the retry policies 612 may indicate the conditions under which a client 202 may re-send Web API calls 104 that were rejected by the gateway 502 (e.g., for failing to comply with the call policy information 106) and/or were unsuccessfully processed by one or more microservices 504. The appropriate use of retry policies 612 may significantly improve the performance of the system 100 and reduce instances in which clients 202 experience a denial of service (DoS) by the web service 102. Even a good client 202 may continue to send additional Web API calls 104 in a circumstance where the web service 102 is busy and cannot process an initial call. This way, a good client may turn into a bad client exactly when the service is under stress, contributing negatively to the on-going situation.

A good client may use a back-off technique or approach and the service can instruct the client 202 (e.g., via the retry policies 612 in the call policy information 106) about how to proceed with retries based on that technique. For example, as shown in FIG. 6, the retry policies 612 may specify a number of retries 612 a that are to be attempted by the back-off technique and/or whether the back-off technique is to perform a linear or exponential back-off 612 b, etc. Further, as also shown, in some implementations, the retry policies 612 may include one or more error codes 612 c indicating a reason the initial Web API call 104 was unsuccessful.
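As a hedged illustration of how a client 202 might turn a number of retries 612 a and a back-off mode 612 b into a retry schedule, consider the following Python sketch. The base delay, the doubling factor used for the exponential mode, and the function name `backoff_delays` are assumptions chosen for the example.

```python
def backoff_delays(retries: int, mode: str, base_seconds: float = 1.0) -> list:
    """Return the wait (in seconds) before each retry attempt.

    mode "linear":      base, 2*base, 3*base, ...
    mode "exponential": base, 2*base, 4*base, ...
    """
    if mode == "linear":
        return [base_seconds * (attempt + 1) for attempt in range(retries)]
    if mode == "exponential":
        return [base_seconds * (2 ** attempt) for attempt in range(retries)]
    raise ValueError(f"unknown back-off mode: {mode!r}")


if __name__ == "__main__":
    print(backoff_delays(3, "linear", 2.0))       # [2.0, 4.0, 6.0]
    print(backoff_delays(4, "exponential", 1.0))  # [1.0, 2.0, 4.0, 8.0]
```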

In some implementations, call policy information 106, including the retry policies 612, may also be included with error messages received from the web service 102 (e.g., an HTTP 500). Such an approach may help facilitate the dynamic updating of the next call policies 610 and/or retry policies 612 by the web service 102. In some implementations, for example, the gateway 502 may dynamically adjust either or both of those policies according to the service status, load, active connections, number of running threads, etc. The exception here may be when the web service 102 does not respond at all to a Web API call 104, such as when one or more components of the web service 102 are down or the Web API call 104 is dropped by the network. In some implementations, the clients 202 may be configured to follow the previously received retry policies 612 in such circumstances.

As explained in more detail below, using the call policy information 106 noted above, the clients 202 may be able to cooperate with the web service 102 by collectively distributing their Web API calls 104 over time, thus minimizing the risk that the web service 102 will be inundated by an unduly large number of Web API calls 104 at the same time. Further, in some implementations, additional steps may be taken to make sure the clients 202 actually follow the next call policies 610 and/or retry policies 612 that have been communicated to them. For example, in some implementations, client compliance may be regulated by encapsulating the call policy information 106 into security entities (e.g., cryptographically signed tokens, certificates, data fragments, etc.) that cannot be tampered with and requesting that the clients 202 return such security entities when making respective Web API calls 104. The signed policy token 602 described above in connection with FIG. 6 may be used as such a security entity. Benefits of using a signed policy token 602 such as that described above include, among other things, that such an entity: (A) cannot be altered by clients 202, (B) can be added in the request/response headers, so the actual request/response data models need not be changed, (C) does not require a lot of resources to be generated, (D) is easy to validate, (E) does not require shared storage in case of a multi-instance service as the signing keys can be shared during service deployment, (F) can be used separately from and in addition to authorization and authentication policies to reject bad actors in lower layers (e.g., firewalls, middleware, or other components configured to evaluate incoming messages), and (G) is visible to intermediate components like load balancers or API gateways so traffic/call management can be offloaded to those components. It should be appreciated, however, that in some implementations the policy information 106 may additionally or alternatively be communicated to clients 202 using other formats, such as by way of data fragments (which may optionally be signed using a private key of the gateway 502) that are included within API responses 506 sent to clients 202, or are otherwise communicated to and/or made available for access by clients 202.

FIG. 7 shows an example routine 700 that may be employed by one or more components of the system 100, or by one or more components that operate in conjunction with the system 100, to implement certain aspects of the present disclosure. In some implementations, for example, one or more, or perhaps all, of the steps of the routine 700 may be implemented by the gateway 502 shown in FIG. 5. In other implementations, one or more of the steps of the routine 700 may additionally or alternatively be performed by one or more other system components, such as one or more load balancers (not illustrated) that manage traffic for the web service 102 and/or by one or more of the microservices 504. Further, it should be appreciated that, in some implementations, one or more, or perhaps all, of the steps of the routine 700 may additionally or alternatively be implemented in one or more lower layers of the web service 102, such as authentication and/or authorization, or perhaps even in one or more higher layers of the web service 102, such as one or more application handlers. Accordingly, the example implementation described below, in which the routine 700 is described as being performed by the gateway 502, should not be viewed as limiting with respect to the component(s) that may be used to implement the routine 700. That said, performing some or all of the steps of the routine 700 by one or more components other than the microservice(s) 504, e.g., the gateway 502 and/or one or more load balancers, may provide certain advantages in some implementations, as doing so may serve to reduce the complexity of the microservice(s) 504 and/or may prevent the microservice(s) 504 from being burdened with additional tasks.

FIG. 8 shows an example table 800 that may be updated and referenced by the gateway 502 (and/or by one or more other system components used to implement the routine 700), as described below, in connection with performing the routine 700. In some implementations, the gateway 502 (and/or one or more other components used to implement the routine 700) may include one or more processors and one or more computer-readable media encoded with instructions which, when executed by the one or more processors, cause the gateway 502 (and/or the one or more other components used to implement the routine 700) to perform the actions described herein. In such implementations, the table 800 may be embodied by one or more databases or other storage mediums accessible to the gateway 502 (and/or one or more other components used to implement the routine 700).

As shown in FIG. 7, the routine 700 may begin when, at a decision step 702, the gateway 502 determines that a new Web API call 104 has been received from a client 202, thus causing the routine 700 to proceed to a decision step 704.

At the decision step 704 of the routine 700, the gateway 502 may determine whether the received Web API call 104 includes call policy information 106. In some implementations, for example, the gateway 502 may determine whether the Web API call 104 includes a policy token 602 of the type described above in connection with FIG. 6.

When, at the decision step 704 of the routine 700, the gateway 502 determines that the received Web API call 104 includes call policy information 106, the routine 700 may proceed to a decision step 706, at which the gateway 502 may determine whether the call policy information 106 is authentic, as described below. When, on the other hand, the gateway 502 determines (at the decision step 704) that the Web API call 104 does not include call policy information 106, the routine 700 may instead proceed to a decision step 710, at which the gateway 502 may determine whether to treat the failure to include call policy information 106 in the received Web API call 104 with leniency, i.e., to overlook a reason for non-compliance. As noted below, in the event that the gateway 502 determines (at the decision step 710) to grant leniency to a non-compliant Web API call 104, the routine 700 may proceed to the step 714 at which the gateway 502 may forward the Web API call 104 to an appropriate service (e.g., one of the microservices 504) for processing. Examples of circumstances in which the gateway 502 may determine to grant leniency to a non-compliant Web API call 104 are set forth below. In a circumstance in which the gateway 502 determined (at the decision step 704) that the Web API call 104 lacked call policy information 106, the gateway 502 may determine (at the decision step 710) to grant leniency to the client 202 because, for example, the Web API call 104 may have been the initial Web API call 104 the client 202 has made to the web service 102 and the Web API call 104 may lack a policy token 602 simply because that client 202 has not yet been provided with a policy token 602 (e.g., together with a prior API response 506) to include in that call.

The gateway 502 may additionally or alternatively determine to grant leniency to a client device 202 for a non-compliant Web API call 104 (either for failure to include call policy information 106 in a Web API call 104 per the decision step 704, or for other reasons for non-compliance, e.g., as discussed below in connection with decision steps 706 and 708) in any of a number of other circumstances. For instance, in some implementations, the gateway 502 may allow respective clients 202 a certain number of non-compliant calls before determining to reject the Web API call 104 (per a step 712, as described below) rather than passing it along to an appropriate service, e.g., one of the microservices 504 shown in FIG. 5, for processing (per a step 714, as described below). The gateway 502 may thus determine (at the decision step 710) to grant leniency to a client 202 if that client has not yet exceeded its allotted number of non-compliant calls.

As another example, the gateway 502 may additionally or alternatively take into account the current expected load on the web service 102 when determining (at the decision step 710) whether to grant leniency with respect to a particular non-compliant Web API call 104. With reference to FIG. 8 (described in more detail below), for example, the gateway 502 may consult the table 800 to compare a value of a “scheduled count” entry 810 for the current time slot (e.g., as indicated by an arrow 802 in FIG. 8) with an estimated number of calls that are expected to occur within the current time slot (e.g., by referencing one or more historical “actual count” entries 808 for one or more comparable time slots, such as the same slot on the same day of the week for one or more prior weeks) to determine whether the capacity of the web service 102 is likely to be exceeded if the non-compliant call is allowed to be processed. If the gateway 502 determines that a particular non-compliant Web API call 104 is unlikely to have an adverse impact on clients 202 making compliant calls (e.g., because the value of the “scheduled count” entry 810 for the current time slot is more than a threshold amount below the estimated number of calls that are expected to occur within the current time slot), the gateway 502 may determine (at the decision step 710) to grant leniency to the Web API call 104 currently under consideration. Further, in some implementations, the gateway 502 may additionally or alternatively take into account a priority status assigned to the client 202 that issued the non-compliant call when determining (at the decision step 710) whether to grant leniency with respect to the call currently under consideration, such that higher-priority clients 202 are more likely to be granted leniency than lower-priority clients. Any of a number of additional or different criteria for determining whether to grant leniency for a non-compliant Web API call 104 (per the decision step 710) are also possible and contemplated.
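One hedged way to picture the leniency determination of the decision step 710 is the following Python sketch, which combines the three example criteria discussed above: a per-client allowance of non-compliant calls, a comparison of the “scheduled count” against an estimate of expected calls, and client priority. The data structure, the field names, and the specific choice to halve the threshold for higher-priority clients are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LeniencyInputs:
    noncompliant_calls_so_far: int   # calls this client has already had excused
    allowed_noncompliant_calls: int  # hypothetical per-client allowance
    scheduled_count: float           # "scheduled count" 810 for the current slot
    estimated_calls: float           # estimate from historical "actual count" 808
    headroom_threshold: float        # required gap between the two values
    high_priority: bool = False      # hypothetical priority flag for the client


def grant_leniency(x: LeniencyInputs) -> bool:
    """Decide whether to forward a non-compliant call anyway."""
    # Criterion 1: the client is still within its allowance.
    if x.noncompliant_calls_so_far < x.allowed_noncompliant_calls:
        return True
    # Criterion 3 (assumed form): higher-priority clients need less headroom.
    threshold = x.headroom_threshold / 2 if x.high_priority else x.headroom_threshold
    # Criterion 2: scheduled count is more than `threshold` below the estimate.
    return (x.estimated_calls - x.scheduled_count) > threshold
```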

As noted previously, at the decision step 706 of the routine 700, the gateway 502 may determine whether the call policy information 106 included with a received Web API call 104 is authentic. In some implementations, for example, the call policy information 106 may be encoded within the header 604 and/or payload 606 of a policy token 602 (e.g., a JWT) included with the received Web API call 104, and the gateway 502 may use its private key and the signing technique identified in the header 604 to process the content of the header 604 and payload 606 to determine whether the signature 608 is valid. In some implementations, for example, the gateway 502 may calculate a signature by (A) combining the base64url encoded header and the base64url encoded payload, (B) hashing the combined base64url value with a hashing technique, e.g., SHA256, and (C) encrypting the determined hash using the private key. If the calculated signature matches the signature 608 of the token 602, the gateway 502 may determine the signature 608 is valid. In such implementations, the gateway 502 may thus determine that the call policy information 106 is authentic based on the presence of a valid signature 608 in the received policy token 602. As noted above, the existence of a valid signature 608 may give the gateway 502 confidence that the header 604 and payload 606 have not been altered since the gateway 502 originally generated and signed the policy token 602. Example techniques for generating new policy tokens 602 for distribution to clients are described below in connection with steps 720, 726, and 728 of the routine 700.

When, at the decision step 706 of the routine 700, the gateway 502 determines that the call policy information 106 is not authentic (because a signature 608 is absent or determined to be invalid, or for some other reason), the routine 700 may proceed to the decision step 710, at which the gateway 502 may determine whether to grant leniency for the non-compliant Web API call 104, for any of the reasons noted above or otherwise. When, on the other hand, the gateway 502 determines (at the decision step 706) that the call policy information 106 is authentic (by validating the signature 608 in a policy token 602 or otherwise), the routine 700 may instead proceed to the decision step 708, at which the gateway 502 may determine whether the Web API call 104 is compliant with the policy or policies indicated in the call policy information 106.

As noted above, the call policy information 106 can identify any of a number of criteria that a Web API call 104 must meet in order to be deemed “compliant.” With reference to FIG. 6, for example, in some implementations, a Web API call 104 may be deemed “non-compliant” if the gateway 502 determines that (A) the Web API call 104 was not issued within an allowed slot 610 a or in connection with an indicated event trigger 610 c, and/or (B) the Web API call 104 was issued within a blocked slot 610 b.
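The compliance test of the decision step 708 might be sketched in Python as follows. The sketch assumes that time slots are identified by integer indices and that a policy listing only blocked slots 610 b (with no allowed slots 610 a or event triggers 610 c) permits calls in any slot that is not blocked; both are assumptions for illustration, as is the function name `is_compliant`.

```python
def is_compliant(call_slot, allowed_slots=(), blocked_slots=(),
                 event=None, event_triggers=()):
    """Return True if a call made in `call_slot` satisfies the next-call policy.

    `event` is the (hypothetical) event the call was made in connection
    with, e.g. "logon", or None if there was no such event.
    """
    # (B) a call within a blocked slot is non-compliant
    if call_slot in blocked_slots:
        return False
    # Assumed: with only blocked slots specified, every other slot is permitted
    if not allowed_slots and not event_triggers:
        return True
    # (A) otherwise the call must fall in an allowed slot or match an event trigger
    if call_slot in allowed_slots:
        return True
    return event is not None and event in event_triggers
```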

When, at the decision step 708 of the routine 700, the gateway 502 determines that the Web API call 104 does not comply with the call policy information 106, the routine 700 may proceed to the decision step 710, at which the gateway 502 may determine whether to grant leniency for the non-compliant Web API call 104, for any of the reasons noted above or otherwise. When, on the other hand, the gateway 502 determines (at the decision step 708) that the Web API call 104 does comply with the call policy information 106, the routine 700 may instead proceed to the step 714 at which the gateway 502 may forward the Web API call 104 to an appropriate service (e.g., one of the microservices 504) for processing.
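The flow through the decision steps 704, 706, 708, and 710 described above can be summarized with a short Python sketch. The function name `route_call` and the use of pre-computed boolean inputs are simplifications assumed for illustration; in practice each input would be the outcome of the corresponding determination by the gateway 502.

```python
def route_call(has_policy: bool, is_authentic: bool,
               is_compliant: bool, grant_leniency: bool) -> str:
    """Return "forward" (step 714) or "reject" (step 712) for a received call."""
    # Steps 704, 706, 708: policy present, authentic, and complied with
    if has_policy and is_authentic and is_compliant:
        return "forward"
    # Step 710: any failure above is a non-compliance that may be excused
    return "forward" if grant_leniency else "reject"


if __name__ == "__main__":
    # A first-time caller without a token, granted leniency
    print(route_call(False, False, False, True))
```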

At a decision step 716 of the routine 700, the gateway 502 may determine whether the Web API call 104 was successfully processed by the service to which it was forwarded (per the step 714). With reference to FIG. 5, such a determination may be made, for example, based on the API response 506 (which may include an error message) that the gateway 502 receives from the microservice 504 to which it sent the Web API call 104.

When, at the decision step 716 of the routine 700, the gateway 502 determines that the Web API call 104 was successfully handled by the service to which it was forwarded, the routine 700 may proceed to a step 718, at which the gateway 502 may increment the value of the “actual count” entry 808 in the table 800 (shown in FIG. 8) for the current time slot (as indicated by the arrow 802 in FIG. 8) by “1”. As shown in FIG. 8, in some implementations, individual time slots in the table 800 may be indicated by a “slot start date & time” entry 804 and a “slot end date & time” entry 806, which identify the date/time at which the indicated time slot begins and the date/time at which the indicated time slot ends, respectively. The table 800 may maintain historical records of the “actual count” entries 808 for any desired period of time in the past.

In some implementations, the values of the “actual count” entries 808 may allow the gateway 502 to determine how to best scale the web service 102 to meet expected demands during corresponding time slots in the future. For example, in some implementations, the web service 102 may use the values of the historical “actual count” entries 808 to proactively determine how to provision the capacity of the web service 102 for future time periods, such as by increasing the capacity during business hours where the “actual count” entries 808 indicate a tendency for higher loads and decreasing the capacity during non-business hours where the “actual count” entries 808 indicate a tendency for lower loads. Further, in some implementations, the values of the “actual count” entries 808 may additionally or alternatively allow the gateway 502 to determine how the system is likely to be scaled during particular time slots for purposes of determining how best to set the call policy information 106, as explained in more detail below.

At a step 720 of the routine 700, the gateway 502 may generate new call policy information 106 for the next Web API call 104 the client 202 is to make to the web service 102. In some implementations, generating such policy information may involve constructing a new policy token 602 to append to the API response 506 that is to be sent to the client 202, as illustrated in FIG. 6. In some implementations, the new call policy information 106 may be determined based on the data values in the table 800 (shown in FIG. 8), as explained in more detail below.

At a step 722 of the routine 700, the gateway 502 may update the values of the “scheduled count” entries 810 in the table 800 based on the new call policy information 106 determined at the step 720 (or at steps 726 or 728, as described below). In some implementations, such updates to the “scheduled count” entries 810 may be made to reflect a prediction as to when the client 202 is likely to send its next Web API call 104 based on the call policy information 106 that was determined at the step 720. In some circumstances, the gateway 502 may determine to increment the value of a “scheduled count” entry 810 for only a single time slot in the future. For instance, if the new call policy information 106 determined at the step 720 includes only a single allowed slot 610 a, then the gateway 502 may decide to increment the value of the “scheduled count” entry 810 for just that one allowed slot 610 a by “1”. In other circumstances, the gateway 502 may instead determine to increment the values of the “scheduled count” entries 810 for multiple future time slots based on the new call policy information 106 determined at the step 720. For example, if the new call policy information 106 determined at the step 720 includes two allowed slots 610 a, then the gateway 502 may determine to update the values of the “scheduled count” entries 810 for each of those two time slots by “0.5”. As another example, if the new call policy information 106 determined at the step 720 identifies three of twenty-four possible time slots for the client device 202 to send its next Web API call 104 as blocked slots 610 b, the gateway 502 may determine to update the values of the “scheduled count” entries 810 for the remaining twenty-one possible time slots by “1/21”. Further, as discussed in more detail below, in some implementations, the gateway 502 may determine (at the step 720) to assign different weights to different allowed slots 610 a that it includes in new call policy information 106. In such circumstances, at the step 722, the gateway 502 may apply corresponding weights to the values by which the “scheduled count” entries 810 are updated, so as to accurately reflect the likelihood that the recipient client 202 is going to select particular time slots for sending its next Web API call 104.
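The fractional updates described for the step 722 can be sketched as follows. This is a minimal illustration rather than the gateway's actual logic: the function names `update_scheduled_counts` and `allowed_from_blocked`, the use of a dictionary keyed by hour, and the restriction to integer weights are all assumptions made for the example.

```python
from fractions import Fraction

def update_scheduled_counts(scheduled, allowed_slots, weights=None):
    """Spread one expected call across the allowed slots (hypothetical helper).

    With no weights, each allowed slot receives an equal share of "1".
    Integer weights skew the shares toward the slots the client is more
    likely to pick.
    """
    if weights is None:
        weights = [1] * len(allowed_slots)
    total = sum(weights)
    for slot, weight in zip(allowed_slots, weights):
        scheduled[slot] = scheduled.get(slot, Fraction(0)) + Fraction(weight, total)
    return scheduled

def allowed_from_blocked(all_slots, blocked_slots):
    """Treat every slot that is not blocked as implicitly allowed."""
    blocked = set(blocked_slots)
    return [slot for slot in all_slots if slot not in blocked]

counts = {hour: Fraction(0) for hour in range(24)}
update_scheduled_counts(counts, [5])        # one allowed slot: +1
update_scheduled_counts(counts, [7, 8])     # two allowed slots: +1/2 each
update_scheduled_counts(                    # three blocked slots: +1/21 elsewhere
    counts, allowed_from_blocked(range(24), [0, 1, 2]))
```

Each policy issued adds exactly one expected call to the table in total, so the sum of the entries tracks the number of outstanding policies regardless of how many slots each policy names.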

In any event, referring back to the step 720, in some implementations, the new call policy information 106 the gateway 502 determines for the client 202 at that step may depend on the current values of the “scheduled count” entries 810 at the time the gateway 502 makes such a determination. As noted previously, the current values of the “scheduled count” entries 810 in the table 800 may reflect estimates of the number of Web API calls 104 the web service 102 will receive from clients 202 during the respective time slots that follow the current time slot (e.g., as indicated by the arrow 802). As the gateway 502 issues new call policy information 106 directing, or at least urging, respective clients 202 to issue their next Web API calls 104 during particular future time slots, the gateway 502 may adjust those estimates accordingly (as discussed above in connection with the step 722). As such, the gateway 502 may, at the step 720, generate new call policy information 106 that directs, or at least urges, the recipient client 202 to utilize one or more future time slots that are estimated to be less busy than the others.

In some implementations, when determining the new call policy information 106 for a client at the step 720, the gateway 502 may further take into account (A) any grace period that the client 202 is allotted for satisfying a periodic Web API call requirement, if such a requirement exists, and/or (B) the values of historical “actual count” entries 808, as those values may enable the gateway 502 to determine how the system is likely to be scaled during particular time slots for purposes of determining how best to set the call policy information 106. For example, in some implementations, when determining the call policy data 106 in a given instance, the gateway 502 may refrain from determining call policy data 106 that would raise the “scheduled count” entry 810 for a particular time slot beyond the expected scaled capacity of the web service 102 during that time slot. In some implementations, the gateway 502 may further determine the call policy data 106 so as to ensure that a “buffer value” remains between the value of a “scheduled count” entry 810 for a particular time slot and the expected scaled capacity of the web service 102 during that time slot. Leaving such a buffer value may, for example, provide the gateway 502 with sufficient leeway to provide leniency for at least certain types of non-compliant Web API calls 104 (e.g., per the decision step 710), such as for Web API calls 104 that are determined (e.g., per the decision step 704) not to include call policy information 106 because the clients 202 are making Web API calls 104 for the first time, or otherwise.

Any of a number of techniques and/or criteria may be utilized by the gateway 502 to optimize the call policy information 106 (determined at the step 720) depending on the current values of the “scheduled count” entries 810, and possibly other information (as noted above) that is accessible to the gateway 502. In some implementations, for example, the gateway 502 may identify a number of allowed slots 610 a corresponding to future time slots that are estimated to be less busy than others, and may additionally apply weights to those allowed slots 610 a in accordance with the relative values of the “scheduled count” entries 810 for those time slots. By employing such an approach, the gateway 502 may “nudge” individual clients 202 into selecting the least busy time slot in a group but also allow those clients 202 leeway to select an alternate time slot that, although not ideal, would be suitable for the web service 102.
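One possible way to combine the capacity limit, the buffer value, and the slot weights is sketched below. The function `choose_allowed_slots`, the per-slot `capacity` mapping, and the choice to weight each slot in proportion to its remaining headroom are assumptions made for illustration; the description above leaves the exact weighting criteria open.

```python
def choose_allowed_slots(scheduled, capacity, candidates, buffer=0.0, max_slots=3):
    """Return {slot: weight} for the least busy candidate slots (hypothetical).

    A slot qualifies only if scheduling one more call would still leave
    `buffer` unused below the expected capacity for that slot. Weights are
    proportional to remaining headroom, so the least busy slot gets the
    largest weight.
    """
    headroom = {}
    for slot in candidates:
        room = capacity[slot] - buffer - scheduled[slot]
        if room >= 1:  # room for at least one more scheduled call
            headroom[slot] = room
    # Most headroom first; ties broken by the earlier slot.
    best = sorted(headroom, key=lambda s: (-headroom[s], s))[:max_slots]
    total = sum(headroom[s] for s in best)
    return {s: headroom[s] / total for s in best}

scheduled = {9: 40, 10: 10, 11: 25, 12: 58}
capacity = {9: 60, 10: 60, 11: 60, 12: 60}
policy = choose_allowed_slots(scheduled, capacity, [9, 10, 11, 12],
                              buffer=5, max_slots=2)
```

In this example slot 12 is excluded because it has already eaten into the buffer, and slots 10 and 11 are offered with weights of roughly 0.6 and 0.4, nudging the client toward slot 10 while leaving slot 11 available.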

In some implementations, the new call policy information 106 the gateway 502 generates at the step 720 may additionally include one or more retry policies 612, so that the recipient client 202 knows how best to handle retry attempts, as discussed above, in the event the next Web API call 104 it attempts is unsuccessful.

At a step 724 of the routine 700, the gateway 502 may send the new call policy information 106 (determined at the step 720, or at the step 726 or 728, as discussed below) to the client device 202 from which the Web API call 104 was received at the decision step 702. For example, in implementations in which the call policy information 106 is encoded in a policy token 602, the generated policy token 602 may be appended to the API response 506 that the gateway 502 sends back to the client 202.

Referring again to the decision step 716 of the routine 700, when the gateway 502 determines (at the decision step 716) that the Web API call 104 was not successfully handled by the service to which it was forwarded, the routine 700 may proceed to a step 726, at which the gateway 502 may generate new call policy information 106 so that the recipient client 202 can be provided with one or more new next call policies 610 and/or one or more new retry policies 612 to mandate, or at least urge, the client 202 to make retry attempts during one or more particular time slots and/or employ a retry technique in a particular fashion, such as by specifying a maximum number of retry attempts, specifying whether to use a linear or exponential back-off technique, etc. The processes and information used to generate the new call policy information 106 at the step 726 may be essentially the same as those described above in connection with the step 720. The primary difference between the steps 726 and 720 is that the new call policy information 106 determined at the step 726 may be specifically configured to guide the client 202 to make a successful retry attempt after the web service 102 was unable to successfully process a Web API call 104, while at the same time maintaining a balanced load for the web service 102.
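A retry policy of the kind mentioned above (a maximum number of attempts plus a linear or exponential back-off) might be interpreted by a client roughly as follows. The `RetryPolicy` fields and the `retry_delays` helper are hypothetical names introduced for this sketch; the actual encoding of a retry policy 612 is not specified here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical shape of a retry policy carried in call policy information."""
    max_attempts: int
    backoff: str          # "linear" or "exponential"
    base_delay_s: float   # delay before the first retry, in seconds

def retry_delays(policy):
    """Return the wait, in seconds, before each permitted retry attempt."""
    attempts = range(1, policy.max_attempts + 1)
    if policy.backoff == "linear":
        # Delay grows by a fixed step: base, 2*base, 3*base, ...
        return [policy.base_delay_s * n for n in attempts]
    if policy.backoff == "exponential":
        # Delay doubles each time: base, 2*base, 4*base, ...
        return [policy.base_delay_s * 2 ** (n - 1) for n in attempts]
    raise ValueError(f"unknown back-off technique: {policy.backoff!r}")

linear = retry_delays(RetryPolicy(max_attempts=4, backoff="linear", base_delay_s=2.0))
exponential = retry_delays(RetryPolicy(max_attempts=4, backoff="exponential", base_delay_s=2.0))
```

With a base delay of two seconds, the linear schedule is 2, 4, 6, 8 seconds and the exponential schedule is 2, 4, 8, 16 seconds, after which the client stops retrying.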

In the event that the gateway 502 rejects a Web API call 104 (at the step 712) after the gateway 502 has determined not to grant leniency to the client 202 (per the decision step 710) for a non-compliant Web API call 104, the routine 700 may proceed to a step 728, at which the gateway 502 may generate new call policy information 106 so that the recipient client 202 can be provided with one or more new next call policies 610 and/or one or more new retry policies 612 to mandate, or at least urge, the client 202 to make retry attempts during one or more particular time slots and/or employ a retry technique in a particular fashion, such as by specifying a maximum number of retry attempts, specifying whether to use a linear or exponential back-off technique, etc. The processes and information used to generate the new call policy information 106 at the step 728 may be essentially the same as those described above in connection with the steps 720 and 726. The primary difference between the step 728 and the steps 720 and 726 is that the new call policy information 106 determined at the step 728 may be specifically configured to guide the client 202 to make a successful retry attempt after a Web API call 104 has been rejected by the gateway 502 for non-compliance with one or more policies, while at the same time maintaining a balanced load for the web service 102.

In some implementations, the implementation of the steps 722 and 724 following either of the steps 726 and 728 may be the same as or similar to the implementation of those steps following the step 720. The new call policy information 106 processed by the steps 722 and 724 may, however, be different when that new call policy information 106 is generated by the respective steps 720, 726, 728. Further, as indicated in the text of the step 724, in some circumstances, e.g., when a call is rejected by the gateway 502 per the step 712, the new call policy information 106 may be included as part of an error message, rather than as part of an API response 506, in some implementations.

An illustrative example of a scenario in which an intelligent Web API call management system 100, such as described herein, may be employed will now be described. Consider a server which wants to uniformly distribute the requests (e.g., Web API calls 104) it receives from clients over a twenty-four hour time period. Depending on the provisioned capacity, the server may determine the optimal number of requests it can serve in a one hour window. On the server side, a first data structure (e.g., “scheduled count” entries 810 of the table 800) may keep the call schedule for a future day. The schedule may be divided into twenty-four slots (keyed by the hour) and the value for each such slot may be a numerical count initialized to zero. The server may also maintain a second, similar data structure (e.g., the “actual count” entries 808 of the table 800) to track the requests that are actually being processed as the day progresses. For every request successfully processed, the count for the corresponding slot in the second data structure may be incremented. For a dropped request, the count for the corresponding slot in the second data structure is not incremented. This second data structure may serve the purpose of recording the historical pattern of requests and identifying patterns (peak periods and lean periods).
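The two server-side data structures in this example can be sketched as a pair of dictionaries keyed by the hour. The names `scheduled_count`, `actual_count`, `record_request`, and `peak_and_lean` are assumptions for illustration, as is the simple definition of peak and lean periods as the single busiest and quietest hours.

```python
# First data structure: calls scheduled for each hour of a future day.
scheduled_count = {hour: 0 for hour in range(24)}
# Second data structure: requests actually processed in each hour.
actual_count = {hour: 0 for hour in range(24)}

def record_request(actual, hour, processed):
    """Count a request only if it was successfully processed, not dropped."""
    if processed:
        actual[hour] += 1

def peak_and_lean(actual):
    """Return (busiest hour, quietest hour); ties go to the earlier hour."""
    peak = max(actual, key=lambda h: (actual[h], -h))
    lean = min(actual, key=lambda h: (actual[h], h))
    return peak, lean

for _ in range(3):
    record_request(actual_count, 9, processed=True)
record_request(actual_count, 14, processed=True)
record_request(actual_count, 14, processed=False)  # dropped: not counted

peak_hour, lean_hour = peak_and_lean(actual_count)
```

Here hour 9 is the peak with three processed requests, while the dropped request in hour 14 leaves that slot's count at one.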

For every request successfully processed, the server may include a token (e.g., a policy token 602) in the header or payload of the response, e.g., an API response 506. The token may identify an allowed slot (e.g., an allowed slot 610 a indicated in a policy token 602) for the future date during which the next request can be issued from the client and/or one or more blocked slots (e.g., one or more blocked slots 610 b indicated in a policy token 602) which are to be avoided. These slots may be calculated based on the first data structure (representing the schedule for the future day) maintained by the server, the grace window allowed for the client, and the second data structure (representing the historical traffic pattern of requests seen by the server). Upon issuing the token, the server may modify the first data structure (representing the schedule for the future day) to increment the count for the slot(s) assigned to the client in the token. The token may indicate the slots assigned by the server and may be signed to avoid tampering. The token may also include a client identifier for making it non-transferable to other client devices 202. The protocol that the client device 202 is expected to follow to comply with the time slots designated by the server may be indicated in the token. While making the next call, the client may follow the schedule and may include the token in the call to the server. The server may evaluate the content of the received token and determine whether the client has violated the guidance or complied with it. The server may honor the calls which are made within the allowed slots and/or drop the ones which are sent during blocked slots. The server may also choose to be lenient for a few requests and become stricter if the non-compliance pattern continues.
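The issue-then-verify cycle for a signed, client-bound token can be sketched as below. This example deliberately swaps in an HMAC-SHA256 tag computed with a server-held secret in place of a private-key signature, because that is available in the Python standard library; the token layout, the field names, the `SERVER_KEY` value, and the three result strings are all hypothetical.

```python
import base64
import hashlib
import hmac
import json

SERVER_KEY = b"demo-key-not-for-production"  # hypothetical server-held secret

def issue_token(client_id, allowed, blocked, key=SERVER_KEY):
    """Build "<payload>.<tag>" where the tag authenticates the payload."""
    payload = json.dumps(
        {"client": client_id, "allowed": allowed, "blocked": blocked},
        sort_keys=True,
    ).encode()
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(tag).decode())

def check_call(token, client_id, slot, key=SERVER_KEY):
    """Classify a call as "reject", "non-compliant", or "compliant"."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:  # malformed token or bad base64
        return "reject"
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return "reject"          # tampered or forged token
    data = json.loads(payload)
    if data["client"] != client_id:
        return "reject"          # token was issued to a different client
    if slot in data["blocked"]:
        return "non-compliant"   # call arrived in a blocked slot
    if data["allowed"] and slot not in data["allowed"]:
        return "non-compliant"   # call arrived outside the allowed slots
    return "compliant"

token = issue_token("client-a", allowed=[14], blocked=[9, 10])
```

A "non-compliant" result is kept separate from "reject" so that the server can still decide whether to be lenient for a few such calls before becoming stricter.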

For requests the server can't process due to current load, the server may consult the near-term schedule (e.g., as represented by the first data structure) within the grace window of the caller and provide a retry token with an allowed slot which has capacity, and/or with one or more blocked slots marked so that the client can do a smart retry and thus increase its chances of success.
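Selecting the slots for such a retry token might look like the following sketch. The function `pick_retry_slot`, the single `capacity` figure shared by all slots, the expression of the grace window as a number of upcoming slots, and the wrap-around past the last slot of the day are assumptions made for illustration.

```python
def pick_retry_slot(scheduled, capacity, current_slot, grace_slots, num_slots=24):
    """Return (allowed_slot, blocked_slots) within the caller's grace window.

    The window covers the `grace_slots` slots after the current one. Slots
    already at capacity are marked blocked; the least loaded remaining slot,
    if any, is offered as the allowed slot.
    """
    window = [(current_slot + k) % num_slots for k in range(1, grace_slots + 1)]
    open_slots = [s for s in window if scheduled[s] < capacity]
    blocked = [s for s in window if scheduled[s] >= capacity]
    # min() returns the earliest slot in the window when loads are tied.
    allowed = min(open_slots, key=lambda s: scheduled[s]) if open_slots else None
    return allowed, blocked

schedule = {hour: 0 for hour in range(24)}
schedule.update({22: 50, 23: 50, 0: 30, 1: 10})
allowed_slot, blocked_slots = pick_retry_slot(
    schedule, capacity=50, current_slot=21, grace_slots=4)
```

With hours 22 and 23 already full, the retry token in this example would mark those two slots as blocked and offer hour 1 as the allowed slot.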

With such an approach, clients may be guided by the server to generate a traffic pattern which is uniformly distributed, avoids spikes/denial of service, and at the same time makes best use of server resources without affecting the business capability. Further, in some such implementations, the server need only know the count of the calls it has scheduled in a future slot (e.g., as represented by the first data structure) and not the exact clients scheduled therein.

While the foregoing example references one hour slots, any other suitable time interval may instead be employed. The same method may work for finer time slots (minutes, seconds, etc.) as well as coarser ones, such as days. Similarly, the server may identify multiple allowed slots with different weights for the client to choose from, to give more flexibility to the client but at the same time centrally shape the traffic pattern.
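On the client side, choosing among several weighted allowed slots could be as simple as a weighted random draw. The helper `pick_slot` and the representation of the weighted slots as a `{slot: weight}` mapping are assumptions for this sketch; a client could equally apply its own preferences on top of the weights.

```python
import random

def pick_slot(weighted_slots, rng=random):
    """Draw one allowed slot at random, in proportion to its weight."""
    slots = list(weighted_slots)
    weights = [weighted_slots[slot] for slot in slots]
    return rng.choices(slots, weights=weights, k=1)[0]

rng = random.Random(0)  # seeded only to make the example repeatable
picks = [pick_slot({10: 0.6, 11: 0.4}, rng) for _ in range(1000)]
```

Across many clients, draws of this kind tend to land in each slot in proportion to its weight, which is what lets the server shape the aggregate traffic while leaving each client some choice.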

F. Example Implementations of Methods, Systems, and Computer-Readable Media in Accordance with the Present Disclosure

The following paragraphs (M1) through (M29) describe examples of methods that may be implemented in accordance with the present disclosure.

(M1) A method may be performed that involves sending, from a computing system to a client device, a first token that includes first data indicating that the client device is authorized to send an application programming interface (API) call to the computing system during at least a first time slot, the first token including a first signature based on the first data and a private key of the computing system; receiving, by the computing system and from the client device, a first API call during the first time slot; determining, by the computing system, that the first API call includes a second token, the second token including second data and a second signature; determining, by the computing system and using the private key and the second data, that the second signature is valid; determining, by the computing system, that the second data indicates that the client device was authorized to send the first API call to the computing system during the first time slot; and processing, by the computing system, the first API call based at least in part on the second data indicating that the client device was authorized to send the first API call to the computing system during the first time slot.

(M2) A method may be performed as described in paragraph (M1), and may further involve receiving, by the computing system and from the client device, a second API call prior to receiving the first API call; including, by the computing system, the first token in a response to the second API call; and sending, by the computing system, the response to the client device.

(M3) A method may be performed as described in paragraph (M1) or paragraph (M2), and may further involve generating, by the computing system, the first data based at least in part on stored data that is indicative of a first estimated capacity of the computing system to process API calls during the first time slot.

(M4) A method may be performed that involves sending, from a computing system to a client device, first data indicating that the client device is authorized to send an application programming interface (API) call to the computing system during at least a first time slot; receiving, by the computing system and from the client device, a first API call during the first time slot; and processing, by the computing system, the first API call.

(M5) A method may be performed as described in paragraph (M4), and may further involve receiving, by the computing system and from the client device, a second API call prior to receiving the first API call; including, by the computing system, the first data in a response to the second API call; and sending, by the computing system, the response to the client device.

(M6) A method may be performed as described in paragraph (M4) or paragraph (M5), and may further involve determining, by the computing system, that the first API call was received from the client device during the first time slot; and determining, by the computing system, to process the first API call based at least in part on the first API call having been received during the first time slot.

(M7) A method may be performed as described in paragraph (M6), and may further involve determining, by the computing system, that the first API call includes second data indicative of the client device having been authorized to send the first API call to the computing system during the first time slot; wherein determining to process the first API call may be further based at least in part on the second data being indicative of the client device having been authorized to send the first API call to the computing system during the first time slot.

(M8) A method may be performed as described in paragraph (M7), and may further involve determining, by the computing system, that the second data matches the first data; wherein determining to process the first API call may be further based at least in part on the second data matching the first data.

(M9) A method may be performed as described in any of paragraphs (M4) through (M8), and may further involve generating, by the computing system, a token that represents the first data and includes a first signature based on the first data and a private key of the computing system; wherein the computing system may send the first data to the client device as a part of the token.

(M10) A method may be performed as described in any of paragraphs (M4) through (M9), and may further involve determining, by the computing system, that the first API call includes a token; and determining that content of the token indicates that the client device has been authorized to send the first API call to the computing system during the first time slot; wherein determining to process the first API call may be further based at least in part on the content of the token indicating that the client device has been authorized to send the first API call to the computing system during the first time slot.

(M11) A method may be performed as described in paragraph (M10), and may further involve determining, by the computing system and using a private key of the computing system and content of the token, that a signature of the token is valid; wherein determining to process the first API call may be further based at least in part on the signature of the token being valid.

(M12) A method may be performed as described in any of paragraphs (M4) through (M11), and may further involve generating, by the computing system, the first data based at least in part on stored data that is indicative of a first estimated capacity of the computing system to process API calls during the first time slot.

(M13) A method may be performed as described in paragraph (M12), and may further involve updating the stored data to indicate a reduced availability of the first time slot to service API calls in response to generating the first data.

(M14) A method may be performed as described in paragraph (M12) or paragraph (M13), wherein the stored data may be further indicative of a second estimated capacity of the computing system to process API calls during a second time slot, and the method may further involve determining, by the computing system, that the first estimated capacity is greater than the second estimated capacity; and configuring, by the computing system and based at least on the first estimated capacity being greater than the second estimated capacity, the first data to indicate that the client device is authorized to send an API call to the computing system during the first time slot rather than the second time slot.

(M15) A method may be performed as described in any of paragraphs (M4) through (M14), and may further involve determining a frequency with which the client device is to make API calls; determining a grace period for the client device to make Web API calls with the frequency; and generating, by the computing system, the first data based at least in part on the frequency and the grace period.

(M16) A method may be performed as described in any of paragraphs (M4) through (M14), wherein the first data may further indicate how the client device is to go about sending an additional API call if an attempted API call fails.

(M17) A method may be performed that involves sending, from a computing system to a client device, first data indicating that the client device is prohibited from sending an API call to the computing system during at least a first time slot; receiving, by the computing system and from the client device, a first API call during a second time slot that is different than the first time slot; and processing, by the computing system, the first API call.

(M18) A method may be performed as described in paragraph (M17), and may further involve receiving, by the computing system and from the client device, a second API call prior to receiving the first API call; including, by the computing system, the first data in a response to the second API call; and sending, by the computing system, the response to the client device.

(M19) A method may be performed as described in paragraph (M17) or paragraph (M18), and may further involve determining, by the computing system, that the first API call was received from the client device outside of the first time slot; and determining, by the computing system, to process the first API call based at least in part on the first API call having been received outside of the first time slot.

(M20) A method may be performed as described in paragraph (M19), and may further involve determining, by the computing system, that the first API call includes second data indicative of the client device having been prohibited from sending an API call to the computing system during the first time slot but not having been prohibited from sending an API call to the computing system during the second time slot; wherein determining to process the first API call may be further based at least in part on the second data being indicative of the client device not having been prohibited from sending an API call to the computing system during the second time slot.

(M21) A method may be performed as described in paragraph (M20), and may further involve determining, by the computing system, that the second data matches the first data; wherein determining to process the first API call may be further based at least in part on the second data matching the first data.

(M22) A method may be performed as described in any of paragraphs (M17) through (M21), and may further involve generating, by the computing system, a token that represents the first data and includes a first signature based on the first data and a private key of the computing system; wherein the computing system may send the first data to the client device as a part of the token.

(M23) A method may be performed as described in any of paragraphs (M17) through (M22), and may further involve determining, by the computing system, that the first API call includes a token; and determining that content of the token indicates that the client device has been prohibited from sending an API call to the computing system during the first time slot but has not been prohibited from sending an API call to the computing system during the second time slot; wherein determining to process the first API call may be further based at least in part on the content of the token indicating that the client device has not been prohibited from sending an API call to the computing system during the second time slot.

(M24) A method may be performed as described in paragraph (M23), and may further involve determining, by the computing system and using a private key of the computing system and content of the token, that a signature of the token is valid; wherein determining to process the first API call may be further based at least in part on the signature of the token being valid.

(M25) A method may be performed as described in any of paragraphs (M17) through (M24), and may further involve generating, by the computing system, the first data based at least in part on stored data that is indicative of a first estimated capacity of the computing system to process API calls during the first time slot.

(M26) A method may be performed as described in paragraph (M25), and may further involve updating the stored data to indicate a reduced availability of the second time slot to service API calls in response to generating the first data.

(M27) A method may be performed as described in paragraph (M25) or paragraph (M26), wherein the stored data may be further indicative of a second estimated capacity of the computing system to process API calls during the second time slot, and the method may further involve determining, by the computing system, that the first estimated capacity is lower than the second estimated capacity; and configuring, by the computing system and based at least on the first estimated capacity being lower than the second estimated capacity, the first data to indicate that the client device is prohibited from sending an API call to the computing system during the first time slot but is not prohibited from sending an API call to the computing system during the second time slot.

(M28) A method may be performed as described in any of paragraphs (M17) through (M27), and may further involve determining a frequency with which the client device is to make API calls; determining a grace period for the client device to make Web API calls with the frequency; and generating, by the computing system, the first data based at least in part on the frequency and the grace period.

(M29) A method may be performed as described in any of paragraphs (M17) through (M28), wherein the first data may further indicate how the client device is to go about sending an additional API call if an attempted API call fails.

The following paragraphs (S1) through (S29) describe examples of systems and devices that may be implemented in accordance with the present disclosure.

(S1) A system may comprise at least one processor and at least one computer-readable medium encoded with instructions which, when executed by the at least one processor, cause the system to send, to a client device, a first token that includes first data indicating that the client device is authorized to send an application programming interface (API) call to the system during at least a first time slot, the first token including a first signature based on the first data and a private key of the system, to receive, from the client device, a first API call during the first time slot, to determine that the first API call includes a second token, the second token including second data and a second signature, to determine, using the private key and the second data, that the second signature is valid, to determine that the second data indicates that the client device was authorized to send the first API call to the system during the first time slot, and to process the first API call based at least in part on the second data indicating that the client device was authorized to send the first API call to the system during the first time slot.

(S2) A system may be configured as described in paragraph (S1), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to receive, from the client device, a second API call prior to receiving the first API call, to include the first token in a response to the second API call, and to send the response to the client device.

(S3) A system may be configured as described in paragraph (S1) or paragraph (S2), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.

(S4) A system may comprise at least one processor and at least one computer-readable medium encoded with instructions which, when executed by the at least one processor, cause the system to send, to a client device, first data indicating that the client device is authorized to send an application programming interface (API) call to the system during at least a first time slot, to receive, from the client device, a first API call during the first time slot, and to process the first API call.

(S5) A system may be configured as described in paragraph (S4), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to receive, from the client device, a second API call prior to receiving the first API call, to include the first data in a response to the second API call, and to send the response to the client device.

(S6) A system may be configured as described in paragraph (S4) or paragraph (S5), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call was received from the client device during the first time slot, and to determine to process the first API call based at least in part on the first API call having been received during the first time slot.

(S7) A system may be configured as described in paragraph (S6), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes second data indicative of the client device having been authorized to send the first API call to the system during the first time slot, and to determine to process the first API call further based at least in part on the second data being indicative of the client device having been authorized to send the first API call to the system during the first time slot.

(S8) A system may be configured as described in paragraph (S7), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the second data matches the first data, and to determine to process the first API call further based at least in part on the second data matching the first data.

(S9) A system may be configured as described in any of paragraphs (S4) through (S8), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate a token that represents the first data and includes a first signature based on the first data and a private key of the system, and to send the first data to the client device as a part of the token.

(S10) A system may be configured as described in any of paragraphs (S4) through (S9), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes a token, to determine that content of the token indicates that the client device has been authorized to send the first API call to the system during the first time slot, and to determine to process the first API call further based at least in part on the content of the token indicating that the client device has been authorized to send the first API call to the system during the first time slot.

(S11) A system may be configured as described in paragraph (S10), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine, using a private key of the system and content of the token, that a signature of the token is valid, and to determine to process the first API call further based at least in part on the signature of the token being valid.

(S12) A system may be configured as described in any of paragraphs (S4) through (S11), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.

(S13) A system may be configured as described in paragraph (S12), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to update the stored data to indicate a reduced availability of the first time slot to service API calls in response to generating the first data.

(S14) A system may be configured as described in paragraph (S12) or paragraph (S13), wherein the stored data may be further indicative of a second estimated capacity of the system to process API calls during a second time slot, and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first estimated capacity is greater than the second estimated capacity, and to configure the first data to indicate that the client device is authorized to send an API call to the system during the first time slot rather than the second time slot based at least on the first estimated capacity being greater than the second estimated capacity.

(S15) A system may be configured as described in any of paragraphs (S4) through (S14), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine a frequency with which the client device is to make API calls, to determine a grace period for the client device to make web API calls with the frequency, and to generate the first data based at least in part on the frequency and the grace period.

(S16) A system may be configured as described in any of paragraphs (S4) through (S14), wherein the first data may further indicate how the client device is to go about sending an additional API call if an attempted API call fails.

(S17) A system may comprise at least one processor and at least one computer-readable medium encoded with instructions which, when executed by the at least one processor, cause the system to send, to a client device, first data indicating that the client device is prohibited from sending an API call to the system during at least a first time slot, to receive, from the client device, a first API call during a second time slot that is different than the first time slot, and to process the first API call.

(S18) A system may be configured as described in paragraph (S17), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to receive, from the client device, a second API call prior to receiving the first API call, to include the first data in a response to the second API call, and to send the response to the client device.

(S19) A system may be configured as described in paragraph (S17) or paragraph (S18), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call was received from the client device outside of the first time slot, and to determine to process the first API call based at least in part on the first API call having been received outside of the first time slot.

(S20) A system may be configured as described in paragraph (S19), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes second data indicative of the client device having been prohibited from sending an API call to the system during the first time slot but not having been prohibited from sending an API call to the system during the second time slot, and to determine to process the first API call further based at least in part on the second data being indicative of the client device not having been prohibited from sending an API call to the system during the second time slot.

(S21) A system may be configured as described in paragraph (S20), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the second data matches the first data, and to determine to process the first API call further based at least in part on the second data matching the first data.

(S22) A system may be configured as described in any of paragraphs (S17) through (S21), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate a token that represents the first data and includes a first signature based on the first data and a private key of the system, and to send the first data to the client device as a part of the token.

(S23) A system may be configured as described in any of paragraphs (S17) through (S22), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes a token, to determine that content of the token indicates that the client device has been prohibited from sending an API call to the system during the first time slot but has not been prohibited from sending an API call to the system during the second time slot, and to determine to process the first API call further based at least in part on the content of the token indicating that the client device has not been prohibited from sending an API call to the system during the second time slot.

(S24) A system may be configured as described in paragraph (S23), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine, using a private key of the system and content of the token, that a signature of the token is valid, and to determine to process the first API call further based at least in part on the signature of the token being valid.

(S25) A system may be configured as described in any of paragraphs (S17) through (S24), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.

(S26) A system may be configured as described in paragraph (S25), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to update the stored data to indicate a reduced availability of the second time slot to service API calls in response to generating the first data.

(S27) A system may be configured as described in paragraph (S25) or paragraph (S26), wherein the stored data may be further indicative of a second estimated capacity of the system to process API calls during the second time slot, and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first estimated capacity is lower than the second estimated capacity, and to configure the first data to indicate that the client device is prohibited from sending an API call to the system during the first time slot but is not prohibited from sending an API call to the system during the second time slot based at least on the first estimated capacity being lower than the second estimated capacity.

(S28) A system may be configured as described in any of paragraphs (S17) through (S27), and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine a frequency with which the client device is to make API calls, to determine a grace period for the client device to make web API calls with the frequency, and to generate the first data based at least in part on the frequency and the grace period.

(S29) A system may be configured as described in any of paragraphs (S17) through (S28), wherein the first data may further indicate how the client device is to go about sending an additional API call if an attempted API call fails.
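
As a rough illustration of the capacity-based scheduling described in paragraphs (S12) through (S14), the following Python sketch picks the time slot with the greatest estimated capacity and then reduces the stored availability of that slot. It is a minimal sketch under stated assumptions, not the disclosed implementation: the names Slot and assign_slot, the use of integer epoch seconds for slot boundaries, and the representation of estimated capacity as a simple per-slot counter are all hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Hypothetical stored record for one time slot."""
    start: int     # slot start, seconds since the epoch (assumed representation)
    end: int       # slot end, seconds since the epoch (assumed representation)
    capacity: int  # estimated number of API calls the slot can still absorb


def assign_slot(slots):
    """Choose the slot with the greatest estimated capacity and reserve one call in it.

    Returns the "first data" as a dict naming the authorized slot.
    """
    best = max(slots, key=lambda s: s.capacity)
    if best.capacity <= 0:
        raise RuntimeError("no slot has remaining estimated capacity")
    # Update the stored data to indicate reduced availability of the chosen slot.
    best.capacity -= 1
    return {"start": best.start, "end": best.end}


stored = [Slot(start=0, end=60, capacity=5), Slot(start=60, end=120, capacity=9)]
first_data = assign_slot(stored)
```

In this sketch the second slot is chosen because its estimated capacity is greater, and its counter is decremented so that later assignments see the reduced availability. The prohibition-based variant of paragraphs (S25) through (S27) could presumably be expressed by returning the lower-capacity slots as prohibited instead.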

The following paragraphs (CRM1) through (CRM29) describe examples of computer-readable media that may be implemented in accordance with the present disclosure.

(CRM1) At least one non-transitory computer-readable medium may be encoded with instructions which, when executed by at least one processor of a system, cause the system to send, to a client device, a first token that includes first data indicating that the client device is authorized to send an application programming interface (API) call to the system during at least a first time slot, the first token including a first signature based on the first data and a private key of the system, to receive, from the client device, a first API call during the first time slot, to determine that the first API call includes a second token, the second token including second data and a second signature, to determine, using the private key and the second data, that the second signature is valid, to determine that the second data indicates that the client device was authorized to send the first API call to the system during the first time slot, and to process the first API call based at least in part on the second data indicating that the client device was authorized to send the first API call to the system during the first time slot.

(CRM2) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM1), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to receive, from the client device, a second API call prior to receiving the first API call, to include the first token in a response to the second API call, and to send the response to the client device.

(CRM3) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM1) or paragraph (CRM2), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.

(CRM4) At least one non-transitory computer-readable medium may be encoded with instructions which, when executed by at least one processor of a system, cause the system to send, to a client device, first data indicating that the client device is authorized to send an application programming interface (API) call to the system during at least a first time slot, to receive, from the client device, a first API call during the first time slot, and to process the first API call.

(CRM5) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM4), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to receive, from the client device, a second API call prior to receiving the first API call, to include the first data in a response to the second API call, and to send the response to the client device.

(CRM6) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM4) or paragraph (CRM5), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call was received from the client device during the first time slot, and to determine to process the first API call based at least in part on the first API call having been received during the first time slot.

(CRM7) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM6), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes second data indicative of the client device having been authorized to send the first API call to the system during the first time slot, and to determine to process the first API call further based at least in part on the second data being indicative of the client device having been authorized to send the first API call to the system during the first time slot.

(CRM8) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM7), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the second data matches the first data, and to determine to process the first API call further based at least in part on the second data matching the first data.

(CRM9) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM4) through (CRM8), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate a token that represents the first data and includes a first signature based on the first data and a private key of the system, and to send the first data to the client device as a part of the token.

(CRM10) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM4) through (CRM9), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes a token, to determine that content of the token indicates that the client device has been authorized to send the first API call to the system during the first time slot, and to determine to process the first API call further based at least in part on the content of the token indicating that the client device has been authorized to send the first API call to the system during the first time slot.

(CRM11) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM10), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine, using a private key of the system and content of the token, that a signature of the token is valid, and to determine to process the first API call further based at least in part on the signature of the token being valid.

(CRM12) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM4) through (CRM11), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.

(CRM13) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM12), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to update the stored data to indicate a reduced availability of the first time slot to service API calls in response to generating the first data.

(CRM14) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM12) or paragraph (CRM13), wherein the stored data may be further indicative of a second estimated capacity of the system to process API calls during a second time slot, and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first estimated capacity is greater than the second estimated capacity, and to configure the first data to indicate that the client device is authorized to send an API call to the system during the first time slot rather than the second time slot based at least on the first estimated capacity being greater than the second estimated capacity.

(CRM15) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM4) through (CRM14), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine a frequency with which the client device is to make API calls, to determine a grace period for the client device to make web API calls with the frequency, and to generate the first data based at least in part on the frequency and the grace period.

(CRM16) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM4) through (CRM14), wherein the first data may further indicate how the client device is to go about sending an additional API call if an attempted API call fails.

(CRM17) At least one non-transitory computer-readable medium may be encoded with instructions which, when executed by at least one processor of a system, cause the system to send, to a client device, first data indicating that the client device is prohibited from sending an API call to the system during at least a first time slot, to receive, from the client device, a first API call during a second time slot that is different than the first time slot, and to process the first API call.

(CRM18) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM17), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to receive, from the client device, a second API call prior to receiving the first API call, to include the first data in a response to the second API call, and to send the response to the client device.

(CRM19) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM17) or paragraph (CRM18), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call was received from the client device outside of the first time slot, and to determine to process the first API call based at least in part on the first API call having been received outside of the first time slot.

(CRM20) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM19), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes second data indicative of the client device having been prohibited from sending an API call to the system during the first time slot but not having been prohibited from sending an API call to the system during the second time slot, and to determine to process the first API call further based at least in part on the second data being indicative of the client device not having been prohibited from sending an API call to the system during the second time slot.

(CRM21) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM20), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the second data matches the first data, and to determine to process the first API call further based at least in part on the second data matching the first data.

(CRM22) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM17) through (CRM21), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate a token that represents the first data and includes a first signature based on the first data and a private key of the system, and to send the first data to the client device as a part of the token.

(CRM23) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM17) through (CRM22), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes a token, to determine that content of the token indicates that the client device has been prohibited from sending an API call to the system during the first time slot but has not been prohibited from sending an API call to the system during the second time slot, and to determine to process the first API call further based at least in part on the content of the token indicating that the client device has not been prohibited from sending an API call to the system during the second time slot.

(CRM24) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM23), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine, using a private key of the system and content of the token, that a signature of the token is valid, and to determine to process the first API call further based at least in part on the signature of the token being valid.

(CRM25) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM17) through (CRM24), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.

(CRM26) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM25), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to update the stored data to indicate a reduced availability of the second time slot to service API calls in response to generating the first data.

(CRM27) At least one non-transitory computer-readable medium may be configured as described in paragraph (CRM25) or paragraph (CRM26), wherein the stored data may be further indicative of a second estimated capacity of the system to process API calls during the second time slot, and the at least one computer-readable medium may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first estimated capacity is lower than the second estimated capacity, and to configure the first data to indicate that the client device is prohibited from sending an API call to the system during the first time slot but is not prohibited from sending an API call to the system during the second time slot based at least on the first estimated capacity being lower than the second estimated capacity.

(CRM28) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM17) through (CRM27), and may be further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine a frequency with which the client device is to make API calls, to determine a grace period for the client device to make web API calls with the frequency, and to generate the first data based at least in part on the frequency and the grace period.

(CRM29) At least one non-transitory computer-readable medium may be configured as described in any of paragraphs (CRM17) through (CRM28), wherein the first data may further indicate how the client device is to go about sending an additional API call if an attempted API call fails.
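
The token handling described in paragraphs (CRM1), (CRM9), and (CRM11) may be easier to follow with a small example. The Python sketch below issues a token carrying the slot boundaries plus a signature, and later decides whether to process a call by checking the signature and the time of receipt. It is only a sketch under assumptions: the disclosure does not name a signature algorithm, so HMAC-SHA256 keyed with a secret held only by the system is swapped in here as one way to sign and verify with the same key, and the names SECRET_KEY, issue_token, and should_process, as well as the dot-separated token layout, are hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key known only to the system; stands in for the "private key".
SECRET_KEY = b"example-key-held-only-by-the-system"


def issue_token(start, end, key=SECRET_KEY):
    """Build a token holding the first data (slot start/end) and a signature over it."""
    data = json.dumps({"start": start, "end": end}, sort_keys=True).encode()
    signature = hmac.new(key, data, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(data).decode()
            + "." + base64.urlsafe_b64encode(signature).decode())


def should_process(token, received_at, key=SECRET_KEY):
    """Return True only if the signature is valid and the call arrived within the slot."""
    try:
        data_b64, sig_b64 = token.split(".")
        data = base64.urlsafe_b64decode(data_b64)
        signature = base64.urlsafe_b64decode(sig_b64)
    except ValueError:
        # Malformed token: wrong number of parts or invalid base64.
        return False
    expected = hmac.new(key, data, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False
    slot = json.loads(data)
    return slot["start"] <= received_at < slot["end"]


token = issue_token(start=100, end=200)
accepted = should_process(token, received_at=150)
```

Because the signature covers the slot boundaries, a client that edits the token to claim a different slot would fail verification in this sketch, which is consistent with the system relying on the token content only after finding the signature valid.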

Having thus described several aspects of at least one embodiment, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure, and are intended to be within the spirit and scope of the disclosure. Accordingly, the foregoing description and drawings are by way of example only.

Various aspects of the present disclosure may be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing, and are therefore not limited in their application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. For example, aspects described in one embodiment may be combined in any manner with aspects described in other embodiments.

Also, the disclosed aspects may be embodied as a method, of which an example has been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.

Use of ordinal terms such as “first,” “second,” “third,” etc. in the claims to modify a claim element does not by itself connote any priority, precedence or order of one claim element over another or the temporal order in which acts of a method are performed. Such terms are used merely as labels to distinguish one claimed element having a certain name from another element having a same name (but for use of the ordinal term).

Also, the phraseology and terminology used herein is used for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising,” or “having,” “containing,” “involving,” and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.

What is claimed is:
 1. A method, comprising: sending, at a first time and from a computing system to a client device, a first token that includes first data identifying a starting time and an ending time of at least a first time slot during which the client device is authorized to send an application programming interface (API) call to the computing system, wherein the starting time is later than the first time, and the first token includes a first signature based on the first data and a private key of the computing system; receiving, by the computing system and from the client device, a first API call during the first time slot; determining, by the computing system, that the first API call includes a second token, the second token including second data and a second signature; determining, by the computing system and using the private key and the second data, that the second signature is valid; determining, by the computing system, that the second data indicates that the client device was authorized to send the first API call to the computing system during the first time slot; and processing, by the computing system, the first API call based at least in part on the second data indicating that the client device was authorized to send the first API call to the computing system during the first time slot.
 2. The method of claim 1, further comprising: receiving, by the computing system and from the client device, a second API call prior to receiving the first API call; including, by the computing system, the first token in a response to the second API call; and sending, by the computing system, the response to the client device.
 3. The method of claim 1, further comprising: generating, by the computing system, the first data based at least in part on stored data that is indicative of a first estimated capacity of the computing system to process API calls during the first time slot.
 4. A method, comprising: sending, at a first time and from a computing system to a client device, first data identifying a starting time and an ending time of at least a first time slot during which the client device is authorized to send an application programming interface (API) call to the computing system, wherein the starting time is later than the first time; receiving, by the computing system and from the client device, a first API call; determining, by the computing system, that the first API call includes a first token; determining, by the computing system, that the first token includes second data corresponding to the first data; determining, by the computing system, that the second data indicates the client device was authorized to send the first API call to the computing system during the first time slot; determining, by the computing system, that the first API call was received from the client device during the first time slot; and processing, by the computing system, the first API call based at least in part on the first API call having been received during the first time slot and the second data indicating that the client device was authorized to send the first API call to the computing system during the first time slot.
 5. The method of claim 4, further comprising: receiving, by the computing system and from the client device, a second API call prior to receiving the first API call; including, by the computing system, the first data in a response to the second API call; and sending, by the computing system, the response to the client device.
 6. The method of claim 4, further comprising: determining, by the computing system, that the second data matches the first data; and determining to process the first API call based at least in part on the second data matching the first data.
 7. The method of claim 4, further comprising: generating, by the computing system, a second token that represents the first data and includes a first signature based on the first data and a private key of the computing system; wherein the computing system sends the first data to the client device as a part of the second token.
 8. The method of claim 7, further comprising: determining, by the computing system and using the private key and the second data included in the first token, that a second signature of the first token is valid; wherein determining to process the first API call is further based at least in part on the second signature of the first token being valid.
 9. The method of claim 4, further comprising: generating, by the computing system, the first data based at least in part on stored data that is indicative of a first estimated capacity of the computing system to process API calls during the first time slot.
 10. The method of claim 9, further comprising: in response to generating the first data, updating the stored data to indicate a reduced availability of the first time slot for servicing API calls.
 11. The method of claim 9, wherein the stored data is further indicative of a second estimated capacity of the computing system to process API calls during a second time slot different than the first time slot, and the method further comprises: determining, by the computing system, that the first estimated capacity is greater than the second estimated capacity; and configuring, by the computing system and based at least on the first estimated capacity being greater than the second estimated capacity, the first data to indicate that the client device is authorized to send an API call to the computing system during the first time slot rather than the second time slot.
 12. The method of claim 4, further comprising: determining a frequency with which the client device is to make API calls; determining a grace period for the client device to make API calls with the frequency; and generating, by the computing system, the first data based at least in part on the frequency and the grace period.
 13. The method of claim 4, wherein the first data further indicates how the client device is to go about sending an additional API call if an attempted API call fails.
 14. A system, comprising: at least one processor; and at least one non-transitory computer-readable medium encoded with instructions which, when executed by the at least one processor, cause the system to: generate a first token that represents first data and includes a first signature based on the first data and a private key of the system, the first data identifying a starting time and an ending time of a first time slot during which a client device is authorized to send an application programming interface (API) call to the system; send, at a first time and to the client device, the first token, wherein the starting time is later than the first time; receive, from the client device, a first API call; determine that the first API call includes second data corresponding to the first data; determine that the second data indicates the client device was authorized to send the first API call to the system during the first time slot; determine that the first API call was received from the client device during the first time slot; and process, by the system, the first API call based at least in part on the first API call having been received during the first time slot and the second data indicating that the client device was authorized to send the first API call to the system during the first time slot.
 15. The system of claim 14, wherein the at least one non-transitory computer-readable medium is further encoded with additional instructions which, when executed by the at least one processor, further cause the system to: receive, from the client device, a second API call prior to receiving the first API call; include the first token in a response to the second API call; and send the response to the client device.
 16. The system of claim 14, wherein the at least one non-transitory computer-readable medium is further encoded with additional instructions which, when executed by the at least one processor, further cause the system to determine that the first API call includes the second data at least in part by: determining that the first API call includes a second token; and determining that the second token includes the second data.
 17. The system of claim 16, wherein the at least one non-transitory computer-readable medium is further encoded with additional instructions which, when executed by the at least one processor, further cause the system to: determine, using the private key and the second data included in the second token, that a second signature of the second token is valid; and determine to process the first API call based at least in part on the second signature of the second token being valid.
 18. The system of claim 14, wherein the at least one non-transitory computer-readable medium is further encoded with additional instructions which, when executed by the at least one processor, further cause the system to: generate the first data based at least in part on stored data that is indicative of a first estimated capacity of the system to process API calls during the first time slot.
 19. The system of claim 18, wherein the at least one non-transitory computer-readable medium is further encoded with additional instructions which, when executed by the at least one processor, further cause the system to: in response to generating the first data, update the stored data to indicate a reduced availability of the first time slot for servicing API calls.
 20. The system of claim 18, wherein the stored data is further indicative of a second estimated capacity of the system to process API calls during a second time slot different than the first time slot, and the at least one non-transitory computer-readable medium is further encoded with additional instructions which, when executed by the at least one processor, further cause the system to: determine that the first estimated capacity is greater than the second estimated capacity; and based at least on the first estimated capacity being greater than the second estimated capacity, configure the first data to indicate that the client device is authorized to send an API call to the system during the first time slot rather than the second time slot.
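
By way of non-limiting illustration only, the slot selection, token generation, and token validation recited in the claims above might be sketched as follows. The sketch rests on several assumptions that are not part of the claims: HMAC-SHA256 over a JSON payload stands in for the signature scheme (none is specified), the key `SERVER_KEY` stands in for the private key of the system, and the names `pick_slot`, `issue_token`, and `should_process` are hypothetical. Time slots are represented as (starting time, ending time) pairs of integer timestamps.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical secret held only by the system; stands in for the "private key".
SERVER_KEY = b"demo-secret"


def pick_slot(capacity):
    """Choose the slot with the greatest estimated capacity, then record
    its reduced availability in the stored data (here, a plain dict)."""
    slot = max(capacity, key=capacity.get)
    capacity[slot] -= 1
    return slot


def issue_token(slot, key=SERVER_KEY):
    """Build a token holding the slot's start and end times plus a signature
    computed from that data and the key."""
    start, end = slot
    payload = json.dumps({"start": start, "end": end}, sort_keys=True).encode()
    sig = hmac.new(key, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(sig).decode())


def should_process(token, received_at, key=SERVER_KEY):
    """Return True only if the signature is valid and the call arrived
    inside the slot named in the token."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # malformed token or invalid base64
        return False
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return False
    data = json.loads(payload)
    return data["start"] <= received_at < data["end"]


# Usage: two candidate slots with different estimated capacities.
capacity = {(100, 200): 3, (200, 300): 5}
slot = pick_slot(capacity)
token = issue_token(slot)
print(slot, should_process(token, 250), should_process(token, 150))
```

In this sketch the same key both creates and checks the signature, which mirrors the claim language in which the private key and the received data are used together to determine validity. An asymmetric scheme, in which a public key performs the check, would be an equally plausible reading and is not excluded.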